
giddy's Introduction

Python Spatial Analysis Library


PySAL, the Python spatial analysis library, is an open-source, cross-platform Python library for geospatial data science, with an emphasis on geospatial vector data. It supports the development of high-level applications for spatial analysis, such as

  • detection of spatial clusters, hot-spots, and outliers
  • construction of graphs from spatial data
  • spatial regression and statistical modeling on geographically embedded networks
  • spatial econometrics
  • exploratory spatio-temporal data analysis

PySAL Components

PySAL is a family of packages for spatial data science and is divided into four major components:

Lib

The lib layer solves a wide variety of computational geometry problems: graph construction from polygonal lattices, lines, and points; construction and interactive editing of spatial weights matrices and graphs; computation of alpha shapes, spatial indices, and spatial-topological relationships; and reading and writing of sparse graph data, as well as pure-Python readers of spatial vector data. Unlike other PySAL modules, these functions are exposed together as a single package.

  • libpysal : libpysal provides foundational algorithms and data structures that support the rest of the library. This currently includes the following modules: input/output (io), which provides readers and writers for common geospatial file formats; weights (weights), which provides the main class to store spatial weights matrices, as well as several utilities to manipulate and operate on them; computational geometry (cg), with several algorithms, such as Voronoi tessellations or alpha shapes that efficiently process geometric shapes; and an additional module with example data sets (examples).
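To make the core data structure concrete, here is a hedged, plain-Python sketch of the spatial-weights idea that libpysal's weights module formalizes; this is not the libpysal API, and the grid, neighbor lists, and values are invented for illustration:

```python
# Illustrative sketch (plain Python, not the libpysal API): a spatial
# weights object records which observations are neighbors.  Here we list
# rook-contiguity neighbors for a 2x2 grid of cells labeled 0..3:
#
#   0 1
#   2 3
neighbors = {0: [1, 2], 1: [0, 3], 2: [0, 3], 3: [1, 2]}

# Row-standardized weights: each neighbor gets 1/(number of neighbors),
# so the spatial lag becomes a neighborhood average.
weights = {i: {j: 1.0 / len(js) for j in js} for i, js in neighbors.items()}

values = {0: 10.0, 1: 20.0, 2: 30.0, 3: 40.0}

# Spatial lag of cell 0 = weighted average of its neighbors' values.
lag0 = sum(w * values[j] for j, w in weights[0].items())
print(lag0)  # average of cells 1 and 2 -> 25.0
```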

Explore

The explore layer includes modules to conduct exploratory analysis of spatial and spatio-temporal data. At a high level, packages in explore are focused on enabling the user to better understand patterns in the data and suggest new interesting questions rather than answer existing ones. They include methods to characterize the structure of spatial distributions (either on networks, in continuous space, or on polygonal lattices). In addition, this domain offers methods to examine the dynamics of these distributions, such as how their composition or spatial extent changes over time.

  • esda : esda implements methods for the analysis of both global (map-wide) and local (focal) spatial autocorrelation, for both continuous and binary data. In addition, the package increasingly offers cutting-edge statistics about boundary strength and measures of aggregation error in statistical analyses.
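For intuition about what a global spatial autocorrelation statistic measures, here is a from-scratch Moran's I on a toy lattice; esda.Moran is the real implementation, and the grid size and values below are invented:

```python
import numpy as np

# Hand-rolled global Moran's I, for illustration only.
def morans_i(y, W):
    y = np.asarray(y, dtype=float)
    z = y - y.mean()        # deviations from the mean
    n = len(y)
    s0 = W.sum()            # sum of all weights
    return (n / s0) * (z @ W @ z) / (z @ z)

# Binary rook-contiguity weights for a 3x3 grid, cells numbered row-major.
n_side = 3
W = np.zeros((9, 9))
for r in range(n_side):
    for c in range(n_side):
        i = r * n_side + c
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            rr, cc = r + dr, c + dc
            if 0 <= rr < n_side and 0 <= cc < n_side:
                W[i, rr * n_side + cc] = 1.0

clustered = [0, 0, 0, 0, 0, 0, 9, 9, 9]   # similar values are adjacent
print(round(morans_i(clustered, W), 3))   # positive -> clustering
```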

  • giddy : giddy is an extension of esda to spatio-temporal data. The package hosts state-of-the-art methods that explicitly consider the role of space in the dynamics of distributions over time.
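The classic (aspatial) Markov machinery giddy builds on can be sketched in a few lines: estimate a transition probability matrix by counting state-to-state moves in a panel of discretized series. The function name and toy panel below are invented; giddy's Markov classes are the real implementation:

```python
import numpy as np

# Naive transition-matrix estimation from observed state sequences.
def transition_matrix(sequences, k):
    counts = np.zeros((k, k))
    for seq in sequences:
        for a, b in zip(seq[:-1], seq[1:]):   # consecutive-period moves
            counts[a, b] += 1
    row_sums = counts.sum(axis=1, keepdims=True)
    row_sums[row_sums == 0] = 1.0             # avoid 0/0 for unvisited states
    return counts / row_sums

# Two regions observed over five periods, states in {0, 1, 2}.
panel = [[0, 0, 1, 1, 2],
         [0, 1, 1, 2, 2]]
P = transition_matrix(panel, 3)
print(P)   # each row sums to 1
```

giddy's Spatial_Markov goes further by conditioning these transition probabilities on the spatial lag of each observation.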

  • inequality : inequality provides indices for measuring inequality over space and time. These comprise classic measures, such as the Theil T information index and the Gini index in mean deviation form, as well as spatially explicit measures that incorporate the location and spatial configuration of observations in the calculation of inequality measures.
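As a worked example of the classic side of this, the Gini coefficient in mean-deviation form is half the mean absolute difference between all pairs of values, divided by the mean. This sketch is for illustration only and is not the inequality-package API:

```python
# Gini coefficient in mean-deviation form (illustrative, not the
# inequality-package implementation).
def gini(values):
    n = len(values)
    mean = sum(values) / n
    # mean absolute difference over all ordered pairs
    mad = sum(abs(a - b) for a in values for b in values) / (n * n)
    return mad / (2 * mean)

print(gini([1, 1, 1, 1]))            # perfect equality -> 0.0
print(round(gini([0, 0, 0, 4]), 3))  # one holder of everything -> 0.75
```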

  • momepy : momepy is a library for quantitative analysis of urban form - urban morphometrics. It aims to provide a wide range of tools for a systematic and exhaustive analysis of urban form, and it can work with a wide range of elements while focusing on building footprints and street networks. momepy stands for Morphological Measuring in Python.

  • pointpats : pointpats supports the statistical analysis of point data, including methods to characterize the spatial structure of an observed point pattern: a collection of locations where some phenomena of interest have been recorded. This includes measures of centrography which provide overall geometric summaries of the point pattern, including central tendency, dispersion, intensity, and extent.
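Two of the centrographic summaries mentioned here, the mean center and the standard distance, are simple enough to sketch from scratch; the function names and points below are invented for illustration, not the pointpats API:

```python
import math

# Mean center: the centroid of the observed point locations.
def mean_center(points):
    n = len(points)
    return (sum(x for x, _ in points) / n, sum(y for _, y in points) / n)

# Standard distance: RMS distance of points from the mean center,
# a dispersion summary analogous to a standard deviation.
def standard_distance(points):
    cx, cy = mean_center(points)
    n = len(points)
    return math.sqrt(sum((x - cx) ** 2 + (y - cy) ** 2 for x, y in points) / n)

pts = [(0, 0), (2, 0), (0, 2), (2, 2)]
print(mean_center(pts))        # (1.0, 1.0)
print(standard_distance(pts))  # sqrt(2)
```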

  • segregation : the segregation package calculates over 40 different segregation indices and provides a suite of additional features for measurement, visualization, and hypothesis testing that together represent the state-of-the-art in quantitative segregation analysis.

  • spaghetti : spaghetti supports the spatial analysis of graphs, networks, topology, and inference. It includes functionality for the statistical testing of clusters on networks; a robust all-to-all Dijkstra shortest-path algorithm with multiprocessing support; high-performance geometric and spatial computations using geopandas, necessary for high-resolution interpolation along networks; and the ability to connect near-network observations onto the network.
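The shortest-path core of network analysis can be sketched with a minimal Dijkstra over an adjacency-list graph; spaghetti's version is far more featureful (all-to-all, multiprocessing, snapping of observations), and the toy network here is invented:

```python
import heapq

# Minimal single-source Dijkstra: shortest distances from `source`
# over a weighted directed graph {node: [(neighbor, edge_length), ...]}.
def dijkstra(graph, source):
    dist = {source: 0.0}
    heap = [(0.0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist.get(u, float("inf")):
            continue                      # stale queue entry, skip
        for v, w in graph.get(u, ()):
            nd = d + w
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                heapq.heappush(heap, (nd, v))
    return dist

net = {"a": [("b", 1.0), ("c", 4.0)],
       "b": [("c", 2.0), ("d", 5.0)],
       "c": [("d", 1.0)],
       "d": []}
print(dijkstra(net, "a"))  # {'a': 0.0, 'b': 1.0, 'c': 3.0, 'd': 4.0}
```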

Model

In contrast to explore, the model layer focuses on confirmatory analysis. In particular, its packages focus on the estimation of spatial relationships in data with a variety of linear, generalized-linear, generalized-additive, nonlinear, multi-level, and local regression models.

  • mgwr : mgwr provides scalable algorithms for estimation, inference, and prediction using single- and multi-scale geographically-weighted regression models in a variety of generalized linear model frameworks, as well as model diagnostic tools.

  • spglm : spglm implements a set of generalized linear regression techniques, including Gaussian, Poisson, and logistic regression, that allow for sparse matrix operations in their computation and estimation to lower memory overhead and decrease computation time.

  • spint : spint provides a collection of tools to study spatial interaction processes and analyze spatial interaction data. It includes functionality to facilitate the calibration and interpretation of a family of gravity-type spatial interaction models, including those with production constraints, attraction constraints, or a combination of the two.
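The gravity family mentioned above starts from the deterministic formula in which flow is proportional to origin mass times destination attractiveness, damped by distance. The sketch below only evaluates that formula with invented names and numbers; spint's actual estimators calibrate such models statistically:

```python
import numpy as np

# Unconstrained gravity model: T_ij = k * O_i * D_j / d_ij^beta.
# All names and parameters here are invented for illustration.
def gravity_flows(origin_mass, dest_mass, distance, k=1.0, beta=2.0):
    O = np.asarray(origin_mass, dtype=float)[:, None]   # column of origins
    D = np.asarray(dest_mass, dtype=float)[None, :]     # row of destinations
    d = np.asarray(distance, dtype=float)               # pairwise distances
    return k * O * D / d ** beta

flows = gravity_flows([100, 200],            # origin populations
                      [50, 80],              # destination attractiveness
                      [[1.0, 2.0],
                       [2.0, 1.0]])          # distance matrix
print(flows)  # larger masses and shorter distances -> larger flows
```

Production- and attraction-constrained variants rescale rows or columns so predicted outflows or inflows match observed totals.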

  • spreg : spreg supports the estimation of classic and spatial econometric models. Currently it contains methods for estimating standard Ordinary Least Squares (OLS), Two Stage Least Squares (2SLS) and Seemingly Unrelated Regressions (SUR), in addition to various tests of heteroskedasticity, normality, spatial randomness, and different types of spatial autocorrelation. It also includes a suite of tests for spatial dependence in models with binary dependent variables.
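The OLS baseline that spreg's spatial specifications extend can be sketched in a few lines of numpy; this toy omits all the inference and diagnostics spreg provides, and the simulated data are invented:

```python
import numpy as np

# Bare-bones OLS via least squares, with an intercept column added.
def ols(X, y):
    X = np.column_stack([np.ones(len(X)), X])      # add intercept
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return beta

# Simulate y = 3 + 2x + noise and recover the coefficients.
rng = np.random.default_rng(0)
x = rng.uniform(0, 10, size=200)
y = 3.0 + 2.0 * x + rng.normal(0, 0.1, size=200)
b0, b1 = ols(x, y)
print(round(b0, 1), round(b1, 1))  # approximately 3.0 and 2.0
```

Spatial econometric models add terms such as a spatially lagged dependent variable, which is why 2SLS-style instrumental-variable estimators appear in spreg.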

  • spvcm : spvcm provides a general framework for estimating spatially-correlated variance components models. This class of models allows for spatial dependence in the variance components, so that nearby groups may affect one another. It also provides a general-purpose framework for estimating models using Gibbs sampling in Python, accelerated by the numba package.

  • tobler : tobler provides functionality for areal interpolation and dasymetric mapping. Its name is an homage to the legendary geographer Waldo Tobler, a pioneer of dozens of spatial analytical methods. tobler includes functionality for interpolating data using area-weighted approaches, regression model-based approaches that leverage remotely-sensed raster data as auxiliary information, and hybrid approaches.

  • access : access aims to make it easy for analysts to calculate measures of spatial accessibility. This work has traditionally had two challenges: [1] calculating accurate travel time matrices at scale and [2] deriving measures of access using the travel times and supply and demand locations. access implements classic spatial access models, allowing easy comparison of methodologies and assumptions.

  • spopt: spopt is an open-source Python library for solving optimization problems with spatial data. Originating from the original region module in PySAL, it is under active development for the inclusion of newly proposed models and methods for regionalization, facility location, and transportation-oriented solutions.

Viz

The viz layer provides functionality to support the creation of geovisualisations and visual representations of outputs from a variety of spatial analyses. Visualization plays a central role in modern spatial/geographic data science. Current packages provide classification methods for choropleth mapping and a common API for linking PySAL outputs to visualization tool-kits in the Python ecosystem.

  • legendgram : legendgram is a small package that provides "legendgrams", legends that visualize the distribution of observations by color in a given map. These distributional visualizations for map classification schemes assist in analytical cartography and spatial data visualization.

  • mapclassify : mapclassify provides functionality for choropleth map classification. Currently, fifteen different classification schemes are available, including a highly-optimized implementation of Fisher-Jenks optimal classification. Each scheme inherits a common structure that ensures computations are scalable and supports applications in streaming contexts.
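One of the simplest such schemes, quantile classification, can be sketched directly: split the sorted values into k classes of roughly equal counts. The function below is a naive illustration with invented names, not mapclassify's implementation (mapclassify.Quantiles handles ties and edge cases properly):

```python
# Naive quantile map classification: k classes with ~equal counts.
def quantile_bins(values, k):
    srt = sorted(values)
    n = len(srt)
    # upper bound of each class, taken at evenly spaced sorted positions
    bounds = [srt[min(n - 1, int(round((i + 1) * n / k)) - 1)]
              for i in range(k)]

    def classify(v):
        for cls, ub in enumerate(bounds):
            if v <= ub:
                return cls
        return k - 1

    return bounds, [classify(v) for v in values]

bounds, classes = quantile_bins([1, 2, 3, 4, 5, 6, 7, 8, 9], k=3)
print(bounds)   # [3, 6, 9]
print(classes)  # [0, 0, 0, 1, 1, 1, 2, 2, 2]
```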

  • splot : splot provides statistical visualizations for spatial analysis. It offers methods for visualizing global and local spatial autocorrelation (through Moran scatterplots and cluster maps), temporal analysis of cluster dynamics (through heatmaps and rose diagrams), and multivariate choropleth mapping (through value-by-alpha maps). A high-level API supports the creation of publication-ready visualizations.

Installation

PySAL is available through Anaconda (in the defaults or conda-forge channel). We recommend installing PySAL from conda-forge:

conda config --add channels conda-forge
conda install pysal

PySAL can also be installed using pip:

pip install pysal

As of version 2.0.0, PySAL has shifted to Python 3 only.

Users who need an older stable version of PySAL that is Python 2 compatible can install version 1.14.3 through pip or conda:

conda install pysal==1.14.3

Documentation

For help on using PySAL, check out the following resources:

Development

As of version 2.0.0, PySAL is now a collection of affiliated geographic data science packages. Changes to the code for any of the subpackages should be directed at the respective upstream repositories, and not made here. Infrastructural changes for the meta-package, like those for tooling, building the package, and code standards, will be considered.

Development is hosted on github.

Discussions of development, as well as help for users, occur on the developer list and in PySAL's Discord channel.

Getting Involved

If you are interested in contributing to PySAL please see our development guidelines.

Bug reports

To search for or report bugs, please see PySAL's issues.

Build Instructions

To build the meta-package pysal see tools/README.md.

License information

See the file "LICENSE.txt" for information on the history of this software, terms & conditions for usage, and a DISCLAIMER OF ALL WARRANTIES.

giddy's People

Contributors

jgaboardi, jlaura, knaaptime, ljwolf, nmalizia, pastephens, pre-commit-ci[bot], schmidtc, sjsrey, slumnitz, weikang9009


giddy's Issues

rename repos/project

How about instead of giddy, we go with

GeoDiDy = Geospatial DIstribution DYnamics ?

How to use spatial_markov model by adopting my own data

Hi everyone, I have recently been using the pysal package to implement Spatial_Markov. I notice the tutorial document uses "us48.gal", but it doesn't explain how that file was formed. I have two questions:
(1) How do I get us48.gal; in other words, how was the us48 file formed?
(2) I want to use Spatial_Markov with my own data; how should I do that?
I look forward to receiving your answer.
The official documentation is here:
https://giddy.readthedocs.io/en/latest/MarkovBasedMethods.html

KeyError: 1 in spatial_lag.py

Hi,

may I start by apologising if this isn't the right place to raise this issue. This might not be a 'bug', and I thought of putting this on the openspace list; let me know if that's more appropriate.

I am using the new feature of giddy that takes Markov states that are already discretized. The .gal file was made in GeoDa, edited by me to connect some islands, and then re-used in GeoDa to check that the new connections worked. When I read the .gal file using giddy, it appears to load with no errors. However, when I run the spatial Markov code I get the following traceback. There is little guidance on what the issue might be; a KeyError means that something ('1'?) is missing?

Traceback (most recent call last):

File "", line 1, in
runfile('D:/SDC/TOAC/giddy/TOAC.py', wdir='D:/SDC/TOAC/giddy')

File "C:\ProgramData\Anaconda3\lib\site-packages\spyder\utils\site\sitecustomize.py", line 710, in runfile
execfile(filename, namespace)

File "C:\ProgramData\Anaconda3\lib\site-packages\spyder\utils\site\sitecustomize.py", line 101, in execfile
exec(compile(f.read(), filename, 'exec'), namespace)

File "D:/SDC/TOAC/giddy/TOAC.py", line 20, in
sm=gd.Spatial_Markov(q8, w, discrete=True)

File "C:\ProgramData\Anaconda3\lib\site-packages\giddy-1.0.2-py3.6.egg\giddy\markov.py", line 409, in init
T, P = self._calc(y, w, classes, k=k)

File "C:\ProgramData\Anaconda3\lib\site-packages\giddy-1.0.2-py3.6.egg\giddy\markov.py", line 518, in _calc
ly = ps.lag_categorical(w,y)

File "C:\ProgramData\Anaconda3\lib\site-packages\libpysal\weights\spatial_lag.py", line 160, in lag_categorical
return np.vstack([lag_categorical(w,col) for col in y.T]).T

File "C:\ProgramData\Anaconda3\lib\site-packages\libpysal\weights\spatial_lag.py", line 160, in
return np.vstack([lag_categorical(w,col) for col in y.T]).T

File "C:\ProgramData\Anaconda3\lib\site-packages\libpysal\weights\spatial_lag.py", line 172, in lag_categorical
output[w.id2i[int(idx)]] = keys[int(outidx)]

KeyError: 1
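The failing frame is inside lag_categorical, which assigns each observation the most common category among its neighbors. A naive conceptual reimplementation (the data and function name below are invented, and this is not libpysal's code) shows where an analogous KeyError would surface, namely when an id present in the weights has no matching entry in the data:

```python
from collections import Counter

# Conceptual categorical spatial lag: each observation receives the
# most common category among its neighbors.
def categorical_lag(neighbors, categories):
    lag = {}
    for i, js in neighbors.items():
        # A KeyError would surface here if a neighbor id in the weights
        # has no matching key in `categories` -- ids must align exactly.
        counts = Counter(categories[j] for j in js)
        lag[i] = counts.most_common(1)[0][0]
    return lag

neighbors = {0: [1, 2], 1: [0, 2], 2: [0, 1]}
cats = {0: "a", 1: "b", 2: "b"}
print(categorical_lag(neighbors, cats))  # {0: 'b', 1: 'a', 2: 'a'}
```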


How to use the 'development' version

Hello,

this follows on from my earlier query. I find the process of gaining access to a new version of the package that may solve my issue convoluted. First I was pointed to this web page: https://github.com/pysal/pysal/wiki/GitHub-Standard-Operating-Procedures, but after forking pysal, installing Git, and typing 'git clone https://github.com/tra6sdc/pysal.git' I get this in the window:

_steph_000@HOMEPC MINGW64 ~
$ git clone https://github.com/tra6sdc/pysal.git
Cloning into 'pysal'...
remote: Counting objects: 21397, done.
remote: Total 21397 (delta 0), reused 0 (delta 0), pack-reused 21397
Receiving objects: 100% (21397/21397), 57.91 MiB | 5.52 MiB/s, done.
Resolving deltas: 100% (13481/13481), done.
Checking out files: 100% (888/888), done.

steph_000@HOMEPC MINGW64 ~
$_

Now what? Is there now a local version of pysal somewhere on my PC? If so, inside Anaconda/Spyder, what do I do to access this local version? I am also unclear whether I need pysal, libpysal, or giddy. Is there a simple step-by-step list of instructions for what I want to do, one that doesn't assume I have been steeped in git/pysal for years?

Alternatively, is there a timescale for a release of the new version? Thanks.

.travis built on archaic Miniconda

.travis.yml potentially uses an archaic version of Miniconda dating back to 2015. The newer versions are explicitly labeled Miniconda2 or Miniconda3. Within .travis.yml, this should be updated to:

wget https://repo.continuum.io/miniconda/Miniconda3-latest-Linux-x86_64.sh -O miniconda.sh

fail gracefully when `splot` not installed

I am seeing 2 failures locally when splot is not installed. It seems like we should try to get these failures to be a bit more graceful or prevent them from happening.

FAILED giddy/tests/test_directional.py::Rose_Tester::test_rose - TypeError: cannot unpack non-iterable NoneType object
FAILED giddy/tests/test_directional.py::i::test_rose - TypeError: cannot unpack non-iterable NoneType object

Related to "giddy.markov.Homogeneity_Results" command

Hello, I am experiencing difficulties with "giddy.markov.Homogeneity_Results" command and wondering if I can get some help from giddy developers.

I am using the income information to estimate two classic Markov transition matrices (discrete Markov matrix using 5 quantiles) for two different periods. The first period is from the year 1981 to 1996 and the second period is from the year 2001 to 2016.

Below are the two transition probability matrices that I estimated. cm_p1 is a classic Markov's transition matrix using the income information of the period 1981-1996 and cm_p2 is a matrix using the income information of the period 2001-2016.

image

I tried to run the Homogeneity test (LR test and chi-square test) between these two Markov transition matrices using the "giddy.markov.Homogeneity_Results" command.

image

Based on the above information, it seems like I can conduct the homogeneity test using the "giddy.markov.Homogeneity_Results(cm_p1.p, cm_p2.p)" code. But when I run it, it shows an error message saying "not enough values to unpack (expected 3, got 2)"

image

I am a bit confused about what caused this error. Could anyone possibly help me with this problem?

remove api.py

remove api.py to be consistent with other pysal submodules.

change master to main

We should update the master branch to the main branch to conform with the push for reducing problematic language.

encoding issue in README.rst

on a clean ubuntu box with python 3.5 (and 3.6) i'm hitting this:

root@334aa6b92338:/# source activate test-env0
(test-env0) root@334aa6b92338:/# pip install giddy
Collecting giddy
  Downloading https://files.pythonhosted.org/packages/14/ba/7455a5d0f6bb50d46fcca0b10a37c038ce26a886d6bbc11d9b09cef0cc88/giddy-1.1.1.tar.gz
    Complete output from command python setup.py egg_info:
    Traceback (most recent call last):
      File "<string>", line 1, in <module>
      File "/tmp/pip-install-wx3i8mg2/giddy/setup.py", line 18, in <module>
        long_description = file.read()
      File "/miniconda/envs/test-env0/lib/python3.5/encodings/ascii.py", line 26, in decode
        return codecs.ascii_decode(input, self.errors)[0]
    UnicodeDecodeError: 'ascii' codec can't decode byte 0xe2 in position 1891: ordinal not in range(128)
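The traceback shows setup.py reading README.rst with the platform default codec (ASCII in that environment), which fails on the first non-ASCII byte. The usual fix is to pass an explicit encoding when opening the file; a minimal demonstration (the file path and sample text are invented):

```python
import os
import tempfile

# Non-ASCII content of the kind that breaks an ASCII default codec.
text = "en-dash \u2013 and curly quotes \u201cok\u201d"

path = os.path.join(tempfile.mkdtemp(), "README.rst")
with open(path, "w", encoding="utf-8") as f:
    f.write(text)

# Explicit encoding: decoding no longer depends on the platform default.
with open(path, encoding="utf-8") as f:
    long_description = f.read()

print(long_description == text)  # True
```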

update README.md

Two minor issues need to be addressed:

  • the link for Python Package Index goes to the pysal release, not the giddy release
  • Eli's last name is not capitalized in the BibTeX citation

Ability to use a generated spatial markov chain to predict the next n states

It would be great to have a method to use a trained spatial Markov chain to predict the next n states for a set of geometries.

sm = pysal.Spatial_Markov(data, weights, fixed=True, k=7)
prediction = sm.predict(initial_state, weights=weights, steps=10, k=7)

where initial_state would be a list of the current state of each region and prediction would be a list of lists containing the predictions for the next n steps for each region.
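Under simplifying assumptions, the requested behavior amounts to propagating each region's state distribution through powers of a transition matrix. The sketch below is hypothetical, not a giddy API; for an aspatial chain it picks the most likely state at each step (a spatial version would select the row-conditional matrix by each region's neighborhood context):

```python
import numpy as np

# Hypothetical sketch of n-step prediction from a transition matrix P.
def predict_states(P, initial_states, steps):
    P = np.asarray(P, dtype=float)
    k = P.shape[0]
    out = []
    for s in initial_states:
        dist = np.zeros(k)
        dist[s] = 1.0                         # start in a known state
        path = []
        for _ in range(steps):
            dist = dist @ P                   # one Markov step
            path.append(int(dist.argmax()))   # most likely state
        out.append(path)
    return out

P = [[0.9, 0.1, 0.0],
     [0.1, 0.8, 0.1],
     [0.0, 0.1, 0.9]]
print(predict_states(P, initial_states=[0, 2], steps=3))
```

An ensemble variant would instead sample each step randomly and summarize the mean and variance over trials.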

In addition there could be a function

prediction = sm.predict_ensemble(initial_state, weights=weights, steps=10, trials=100, k=7)

where trials is the number of randomly seeded predictions to make. The result would be a list of the mean and variance of the predictions for the next n steps.

release on conda-forge

after checking with @weikang9009, I put the metadata together for a release on conda-forge, and i think things are in pretty good shape. If possible, would @weikang9009 and @sjsrey mind replying to this issue to confirm you're willing to be listed as maintainers of the recipe? If not, i'm still happy to do it solo

Singular matrix when computing ergodic values

Hello, I have made some progress on my previous issue, with the help of a colleague to get libpysal 3.0.3 installed. I now get the message below when I try to print the spatial ERGODIC values. It would be good if I could be provided with some pointers as to what is causing the matrix singularity? I am happy to share the input data and code. Thanks.

Spatial ERGODIC
C:\Users\tra6sdc\AppData\Local\Continuum\anaconda3\lib\site-packages\giddy-1.0.2-py3.6.egg\giddy\markov.py:438: ComplexWarning: Casting complex values to real discards the imaginary part
S[i] = STEADY_STATE(p)
Traceback (most recent call last):

File "", line 1, in
runfile('E:/TOAC.py', wdir='E:')

File "C:\Users\tra6sdc\AppData\Local\Continuum\anaconda3\lib\site-packages\spyder\utils\site\sitecustomize.py", line 710, in runfile
execfile(filename, namespace)

File "C:\Users\tra6sdc\AppData\Local\Continuum\anaconda3\lib\site-packages\spyder\utils\site\sitecustomize.py", line 101, in execfile
exec(compile(f.read(), filename, 'exec'), namespace)

File "E:/TOAC.py", line 37, in
for fq in sm.F:

File "C:\Users\tra6sdc\AppData\Local\Continuum\anaconda3\lib\site-packages\giddy-1.0.2-py3.6.egg\giddy\markov.py", line 447, in F
F[i] = fmpt(np.asmatrix(p))

File "C:\Users\tra6sdc\AppData\Local\Continuum\anaconda3\lib\site-packages\giddy-1.0.2-py3.6.egg\giddy\ergodic.py", line 106, in fmpt
Z = la.inv(I - P + A)

File "C:\Users\tra6sdc\AppData\Local\Continuum\anaconda3\lib\site-packages\numpy\linalg\linalg.py", line 513, in inv
ainv = _umath_linalg.inv(a, signature=signature, extobj=extobj)

File "C:\Users\tra6sdc\AppData\Local\Continuum\anaconda3\lib\site-packages\numpy\linalg\linalg.py", line 90, in _raise_linalgerror_singular
raise LinAlgError("Singular matrix")

LinAlgError: Singular matrix
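The failing line inverts Z = I - P + A, where each row of A is the chain's steady-state distribution; that matrix is invertible only when the chain is ergodic, so a reducible or otherwise degenerate transition matrix (e.g. a class that is never left) makes it singular. The sketch below reimplements the computation naively (this is not giddy's code; the 2-state chain is invented) to show it succeeding on a well-behaved matrix:

```python
import numpy as np

# Steady-state distribution: left eigenvector of P for eigenvalue 1.
def steady_state(P):
    vals, vecs = np.linalg.eig(P.T)
    v = np.real(vecs[:, np.argmin(np.abs(vals - 1.0))])
    return v / v.sum()

# First mean passage times via the fundamental matrix Z = inv(I - P + A),
# which exists only for an ergodic chain.
def fmpt(P):
    k = P.shape[0]
    pi = steady_state(P)
    A = np.tile(pi, (k, 1))                       # every row = steady state
    Z = np.linalg.inv(np.eye(k) - P + A)          # singular if not ergodic
    M = (np.diag(Z)[None, :] - Z) / pi[None, :]   # m_ij = (z_jj - z_ij)/pi_j
    np.fill_diagonal(M, 1.0 / pi)                 # mean recurrence times
    return M

P = np.array([[0.5, 0.5],
              [0.25, 0.75]])
print(fmpt(P))  # [[3, 2], [4, 1.5]]
```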

Extend functions for Markov classes to deal with non-ergodic Markov chains

It is possible that we run into non-ergodic Markov chains; e.g. it might be impossible to migrate from one Markov state to another. Right now, the steady_state function for calculating the limiting distribution and the fmpt function for calculating first mean passage times can only deal with ergodic Markov chains. It would be useful to extend these functions to deal with non-ergodic Markov chains.

CI needs redesign

The last time CI ran successfully was 9 months ago. After this long period of inactivity, much has changed in the dependency stack, and the old CI has a lot of rough edges.

We need to update the CI so new enhancements can be added to the package.

output for classic Markov needs slight rewording

The Markov Chain is irreducible and is composed by:
1 Recurrent class (indices):
[0 1 2 3 4]
0 Transient class.
The Markov Chain has 0 absorbing state.

Should read:

The Markov Chain is irreducible and is composed by:
1 Recurrent class (indices):
[0 1 2 3 4]

v2.3.4 release

I'd like to cut a fresh release (v2.3.4) for this month's PySAL meta-release.

Several items to decide for this release:

  • Do we want to add gha workflow for publishing docs? #179
  • Do we want to replace the current name giddy with estda (exploratory spatial-temporal data analysis) to increase its visibility?
  • Anything else?
