fuzzy's Introduction

LingPy: A Python Library for Automatic Tasks in Historical Linguistics

This repository contains the Python package lingpy which can be used for various tasks in computational historical linguistics.

Authors (Version 2.6.12): Johann-Mattis List and Robert Forkel

Collaborators: Christoph Rzymski, Simon J. Greenhill, Steven Moran, Peter Bouda, Johannes Dellert, Taraka Rama, Tiago Tresoldi, Gereon Kaiping, Frank Nagel, and Patrick Elmer.

LingPy is a Python library for historical linguistics. It is being developed for Python 2.7 and Python 3.x using a single codebase.

Quick Installation

For our latest stable version, you can simply use pip or easy_install for installation:

$ pip install lingpy

or

$ easy_install lingpy

Depending on which easy_install or pip version you use, either the Python2 or the Python3 version of LingPy will be installed.

If you want to install the current GitHub version of LingPy on your system, open a terminal and type in the following:

$ git clone https://github.com/lingpy/lingpy/
$ cd lingpy
$ python setup.py install

If the last command above fails with a user-permissions error (usually "Errno 13"), you can install LingPy into your user-level Python setup instead:

$ python setup.py install --user

In order to use the library, start an interactive Python session and import LingPy as follows:

>>> from lingpy import *

To install LingPy to hack on it, fork the repository on GitHub, open a terminal and type:

$ git clone https://github.com/<your-github-user>/lingpy/
$ cd lingpy
$ python setup.py develop

This will install LingPy in "development mode", i.e. you will be able to edit the sources in the cloned repository and import the altered code just like the regular Python package.

fuzzy's People

Contributors

fredericblum, lingulist

fuzzy's Issues

Cannot import `ntile` from lingrex.fuzzy

@LinguList Running the analysis fails, as I cannot import the function ntile from lingrex:

Traceback (most recent call last):
  File "/Users/blum/Projects/fuzzy/analysis.py", line 4, in <module>
    from lingrex.fuzzy import FuzzyReconstructor, ntile
ImportError: cannot import name 'ntile' from 'lingrex.fuzzy' (/Users/blum/Projects/venv/test/lib/python3.9/site-packages/lingrex/fuzzy.py)
make: *** [burmish-reconstruction] Error 1

Checking back on lingrex, there is indeed no function called ntile. Is there a naming problem? Or has it gone missing while transferring the code to lingrex?
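A quick way to narrow this down is to check which version of the package is installed and whether the module exposes the name at all. The sketch below uses only the standard library; `describe_name` is a hypothetical helper written for this illustration and is not part of lingrex.

```python
import importlib
from importlib import metadata


def describe_name(module_name, attr, dist_name=None):
    """Report whether `module_name` imports and whether it exposes `attr`.

    `dist_name` is the installed distribution to look up the version for;
    by default the top-level package name is used (an assumption that holds
    for many, but not all, packages).
    """
    report = {"module": module_name, "importable": False,
              "has_attr": False, "version": None}
    try:
        module = importlib.import_module(module_name)
    except ImportError:
        return report
    report["importable"] = True
    report["has_attr"] = hasattr(module, attr)
    try:
        report["version"] = metadata.version(dist_name or module_name.split(".")[0])
    except metadata.PackageNotFoundError:
        pass  # e.g. standard-library modules have no distribution metadata
    return report


# Prints a report even if lingrex is not installed in this environment.
print(describe_name("lingrex.fuzzy", "ntile"))
```

If `importable` is true but `has_attr` is false, the installed release does not define the function, which points to a version mismatch rather than a local setup problem.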

Taking advantage of manual alignments

@LinguList I am currently trying to get the code running for the manual alignments of the Panoan data. However, the transform_alignment function from lingrex overwrites the alignments in all cases. Since there is no code documentation, I struggle to solve this. Among other things, I tried manipulating the gap-argument of the function to maintain the -, but it did not work.

Here is an example: Cogid 242, after reducing the alignment in panoan_prep.py:

p a - n o
p a - n u
p a ɨ rⁿ o
p a - n o

Align=False: All gaps are removed, and the alignment is newly created in linear fashion (not what I expected)
Accuracy: 73%

language concept pos S1 S2 S3 S4 S5
Amawaka p a n o
Chakobo Ø Ø Ø Ø
Chaninawa Ø Ø Ø Ø
Kakataibo p a ɨ -
Kapanawa Ø Ø Ø Ø
Katukina p a n u
Kaxarari Ø Ø Ø Ø
Kaxinawa p a n u
Korubo Ø Ø Ø Ø
Marinawa p a n o
Marubo p a n o
Matis Ø Ø Ø Ø
Mayoruna p a n u
Poyanawa Ø Ø Ø Ø
Shanenawa Ø Ø Ø Ø
Sharanawa Ø Ø Ø Ø
ShipiboKonibo p a n o
Yaminawa p a n o
Yawanawa p a n u
Fuzzy p:100 a:100 ɨ:100 rⁿ.o:100
Proto-Panoan p a ɨ rⁿ o

Align=True: A new alignment is created, ignoring the manual input (as perhaps expected, using True as value)
Accuracy: 80%

language concept pos S1 S2 S3 S4 S5
Amawaka p a n o
Chakobo Ø Ø Ø Ø
Chaninawa Ø Ø Ø Ø
Kakataibo p a - ɨ
Kapanawa Ø Ø Ø Ø
Katukina p a n u
Kaxarari Ø Ø Ø Ø
Kaxinawa p a n u
Korubo Ø Ø Ø Ø
Marinawa p a n o
Marubo p a n o
Matis Ø Ø Ø Ø
Mayoruna p a n u
Poyanawa Ø Ø Ø Ø
Shanenawa Ø Ø Ø Ø
Sharanawa Ø Ø Ø Ø
ShipiboKonibo p a n o
Yaminawa p a n o
Yawanawa p a n u
Fuzzy p:100 a.ɨ:80¦a:20 rⁿ:100 o:100
Proto-Panoan p a ɨ rⁿ o

The accuracy is still higher than for the other two datasets. However, I think that we lose some potential here if we don't find a way of making use of the manual work put in. How should we proceed?

Length of alignments

@LinguList As described in Figure 2, we report information on alignment length. Since it is not entirely clear to me what that number refers to, I have difficulty writing the code myself.

The number of words from which alignments for individual proto-forms are reconstructed: is this the average number of words in each cognate set (multiple per cognate set), or the average length of the alignments of each cognate set (one per cognate set)?
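To make the two readings concrete, here is a minimal sketch that computes both on toy data. The dictionary layout (cognate set ID mapped to a list of aligned rows) is an assumption made for this illustration; cognate set 242 reuses the reduced alignment quoted in the manual-alignment issue, and set 7 is made up.

```python
from statistics import mean

# Hypothetical layout: cognate set ID -> aligned words (one row per language).
alignments = {
    242: [
        ["p", "a", "-", "n", "o"],
        ["p", "a", "-", "n", "u"],
        ["p", "a", "ɨ", "rⁿ", "o"],
        ["p", "a", "-", "n", "o"],
    ],
    7: [  # invented second set, only to make the averages non-trivial
        ["t", "a"],
        ["t", "a"],
    ],
}


def mean_words_per_cognate_set(data):
    """Reading 1: how many words (rows) does a cognate set contain on average?"""
    return mean(len(rows) for rows in data.values())


def mean_alignment_length(data):
    """Reading 2: how many columns does an alignment have on average?"""
    return mean(len(rows[0]) for rows in data.values())


print(mean_words_per_cognate_set(alignments))
print(mean_alignment_length(alignments))
```

The first number grows with language coverage, the second with word length, so the two readings can diverge considerably on real data.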

Concatenating make-commands fails

The current setup in the Makefile defines the following type of command:

pano-wordlist:
	edictor wordlist --data=cldf-data/oliveiraprotopanoan/cldf/cldf-metadata.json --preprocessing=data/panoan_prep.py --addon="cognacy:cogid","alignment:alignment" --name=data/panoan
pano-reconstruction:
	python example-1.py Panoan
pano-pdf:
	pandoc -i panoan.md -o panoan.pdf --pdf-engine=xelatex
pano: pano-wordlist, pano-reconstruction, pano-pdf

Individually, the commands pano-wordlist, pano-reconstruction, and pano-pdf run fine. But the final command, pano, throws the following error:

(fuzzy) blum@lingn45 examples % make pano             
make: *** No rule to make target `pano-wordlist,', needed by `pano'.  Stop.
(fuzzy) blum@lingn45 examples % make pano-wordlist
edictor wordlist --data=cldf-data/oliveiraprotopanoan/cldf/cldf-metadata.json --preprocessing=data/panoan_prep.py --addon="cognacy:cogid","alignment:alignment" --name=data/panoan
2023-10-12 09:31:13,058 [INFO] Data has been written to file <data/panoan.tsv>.

Am I missing something? Should we rather create a single command for each language, that runs all the code?
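The error message hints at the cause: make separates prerequisite names by whitespace, so a comma after a name becomes part of that name. The sketch below is a deliberately simplified model of that tokenization (it is not make's actual parser, and `split_rule` is a hypothetical helper) showing why make goes looking for a target literally called `pano-wordlist,`.

```python
def split_rule(line):
    """Split 'target: prereq ...' on whitespace, roughly as make does."""
    target, _, prerequisites = line.partition(":")
    return target.strip(), prerequisites.split()


# The rule as written in the Makefile: commas stick to the names.
broken = split_rule("pano: pano-wordlist, pano-reconstruction, pano-pdf")

# The same rule with whitespace-only separation.
fixed = split_rule("pano: pano-wordlist pano-reconstruction pano-pdf")

print(broken[1])
print(fixed[1])
```

Under this reading, removing the commas from the `pano:` line should be enough for make to find the three existing targets and run them in order.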

Lacking Python 3.11 support on macOS

@LinguList When trying to install the requirements, the installation fails for the lxml package:

Building wheels for collected packages: lxml
  Building wheel for lxml (pyproject.toml) ... error
  error: subprocess-exited-with-error
  
  × Building wheel for lxml (pyproject.toml) did not run successfully.
  │ exit code: 1
  ╰─> [128 lines of output]
      <string>:67: DeprecationWarning: pkg_resources is deprecated as an API. See https://setuptools.pypa.io/en/latest/pkg_resources.html
      Building lxml version 4.8.0.
      Building without Cython.
      Building against libxml2 2.9.13 and libxslt 1.1.35
      Building against libxml2/libxslt in the following directory: /Library/Developer/CommandLineTools/SDKs/MacOSX.sdk/usr/lib
      running bdist_wheel
      running build
      running build_py
      creating build
      creating build/lib.macosx-14-arm64-cpython-311
      creating build/lib.macosx-14-arm64-cpython-311/lxml
      copying src/lxml/_elementpath.py -> build/lib.macosx-14-arm64-cpython-311/lxml
      copying src/lxml/sax.py -> build/lib.macosx-14-arm64-cpython-311/lxml
      copying src/lxml/pyclasslookup.py -> build/lib.macosx-14-arm64-cpython-311/lxml
      copying src/lxml/__init__.py -> build/lib.macosx-14-arm64-cpython-311/lxml
      copying src/lxml/builder.py -> build/lib.macosx-14-arm64-cpython-311/lxml
      copying src/lxml/doctestcompare.py -> build/lib.macosx-14-arm64-cpython-311/lxml
      copying src/lxml/usedoctest.py -> build/lib.macosx-14-arm64-cpython-311/lxml
      copying src/lxml/cssselect.py -> build/lib.macosx-14-arm64-cpython-311/lxml
      copying src/lxml/ElementInclude.py -> build/lib.macosx-14-arm64-cpython-311/lxml
      creating build/lib.macosx-14-arm64-cpython-311/lxml/includes
      copying src/lxml/includes/__init__.py -> build/lib.macosx-14-arm64-cpython-311/lxml/includes
      creating build/lib.macosx-14-arm64-cpython-311/lxml/html
      copying src/lxml/html/soupparser.py -> build/lib.macosx-14-arm64-cpython-311/lxml/html
      copying src/lxml/html/defs.py -> build/lib.macosx-14-arm64-cpython-311/lxml/html
      copying src/lxml/html/_setmixin.py -> build/lib.macosx-14-arm64-cpython-311/lxml/html
      copying src/lxml/html/clean.py -> build/lib.macosx-14-arm64-cpython-311/lxml/html
      copying src/lxml/html/_diffcommand.py -> build/lib.macosx-14-arm64-cpython-311/lxml/html
      copying src/lxml/html/html5parser.py -> build/lib.macosx-14-arm64-cpython-311/lxml/html
      copying src/lxml/html/__init__.py -> build/lib.macosx-14-arm64-cpython-311/lxml/html
      copying src/lxml/html/formfill.py -> build/lib.macosx-14-arm64-cpython-311/lxml/html
      copying src/lxml/html/builder.py -> build/lib.macosx-14-arm64-cpython-311/lxml/html
      copying src/lxml/html/ElementSoup.py -> build/lib.macosx-14-arm64-cpython-311/lxml/html
      copying src/lxml/html/_html5builder.py -> build/lib.macosx-14-arm64-cpython-311/lxml/html
      copying src/lxml/html/usedoctest.py -> build/lib.macosx-14-arm64-cpython-311/lxml/html
      copying src/lxml/html/diff.py -> build/lib.macosx-14-arm64-cpython-311/lxml/html
      creating build/lib.macosx-14-arm64-cpython-311/lxml/isoschematron
      copying src/lxml/isoschematron/__init__.py -> build/lib.macosx-14-arm64-cpython-311/lxml/isoschematron
      copying src/lxml/etree.h -> build/lib.macosx-14-arm64-cpython-311/lxml
      copying src/lxml/etree_api.h -> build/lib.macosx-14-arm64-cpython-311/lxml
      copying src/lxml/lxml.etree.h -> build/lib.macosx-14-arm64-cpython-311/lxml
      copying src/lxml/lxml.etree_api.h -> build/lib.macosx-14-arm64-cpython-311/lxml
      copying src/lxml/includes/xmlerror.pxd -> build/lib.macosx-14-arm64-cpython-311/lxml/includes
      copying src/lxml/includes/c14n.pxd -> build/lib.macosx-14-arm64-cpython-311/lxml/includes
      copying src/lxml/includes/xmlschema.pxd -> build/lib.macosx-14-arm64-cpython-311/lxml/includes
      copying src/lxml/includes/__init__.pxd -> build/lib.macosx-14-arm64-cpython-311/lxml/includes
      copying src/lxml/includes/schematron.pxd -> build/lib.macosx-14-arm64-cpython-311/lxml/includes
      copying src/lxml/includes/tree.pxd -> build/lib.macosx-14-arm64-cpython-311/lxml/includes
      copying src/lxml/includes/uri.pxd -> build/lib.macosx-14-arm64-cpython-311/lxml/includes
      copying src/lxml/includes/etreepublic.pxd -> build/lib.macosx-14-arm64-cpython-311/lxml/includes
      copying src/lxml/includes/xpath.pxd -> build/lib.macosx-14-arm64-cpython-311/lxml/includes
      copying src/lxml/includes/htmlparser.pxd -> build/lib.macosx-14-arm64-cpython-311/lxml/includes
      copying src/lxml/includes/xslt.pxd -> build/lib.macosx-14-arm64-cpython-311/lxml/includes
      copying src/lxml/includes/config.pxd -> build/lib.macosx-14-arm64-cpython-311/lxml/includes
      copying src/lxml/includes/xmlparser.pxd -> build/lib.macosx-14-arm64-cpython-311/lxml/includes
      copying src/lxml/includes/xinclude.pxd -> build/lib.macosx-14-arm64-cpython-311/lxml/includes
      copying src/lxml/includes/dtdvalid.pxd -> build/lib.macosx-14-arm64-cpython-311/lxml/includes
      copying src/lxml/includes/relaxng.pxd -> build/lib.macosx-14-arm64-cpython-311/lxml/includes
      copying src/lxml/includes/lxml-version.h -> build/lib.macosx-14-arm64-cpython-311/lxml/includes
      copying src/lxml/includes/etree_defs.h -> build/lib.macosx-14-arm64-cpython-311/lxml/includes
      creating build/lib.macosx-14-arm64-cpython-311/lxml/isoschematron/resources
      creating build/lib.macosx-14-arm64-cpython-311/lxml/isoschematron/resources/rng
      copying src/lxml/isoschematron/resources/rng/iso-schematron.rng -> build/lib.macosx-14-arm64-cpython-311/lxml/isoschematron/resources/rng
      creating build/lib.macosx-14-arm64-cpython-311/lxml/isoschematron/resources/xsl
      copying src/lxml/isoschematron/resources/xsl/XSD2Schtrn.xsl -> build/lib.macosx-14-arm64-cpython-311/lxml/isoschematron/resources/xsl
      copying src/lxml/isoschematron/resources/xsl/RNG2Schtrn.xsl -> build/lib.macosx-14-arm64-cpython-311/lxml/isoschematron/resources/xsl
      creating build/lib.macosx-14-arm64-cpython-311/lxml/isoschematron/resources/xsl/iso-schematron-xslt1
      copying src/lxml/isoschematron/resources/xsl/iso-schematron-xslt1/iso_abstract_expand.xsl -> build/lib.macosx-14-arm64-cpython-311/lxml/isoschematron/resources/xsl/iso-schematron-xslt1
      copying src/lxml/isoschematron/resources/xsl/iso-schematron-xslt1/iso_dsdl_include.xsl -> build/lib.macosx-14-arm64-cpython-311/lxml/isoschematron/resources/xsl/iso-schematron-xslt1
      copying src/lxml/isoschematron/resources/xsl/iso-schematron-xslt1/iso_schematron_skeleton_for_xslt1.xsl -> build/lib.macosx-14-arm64-cpython-311/lxml/isoschematron/resources/xsl/iso-schematron-xslt1
      copying src/lxml/isoschematron/resources/xsl/iso-schematron-xslt1/iso_svrl_for_xslt1.xsl -> build/lib.macosx-14-arm64-cpython-311/lxml/isoschematron/resources/xsl/iso-schematron-xslt1
      copying src/lxml/isoschematron/resources/xsl/iso-schematron-xslt1/iso_schematron_message.xsl -> build/lib.macosx-14-arm64-cpython-311/lxml/isoschematron/resources/xsl/iso-schematron-xslt1
      copying src/lxml/isoschematron/resources/xsl/iso-schematron-xslt1/readme.txt -> build/lib.macosx-14-arm64-cpython-311/lxml/isoschematron/resources/xsl/iso-schematron-xslt1
      running build_ext
      building 'lxml.etree' extension
      creating build/temp.macosx-14-arm64-cpython-311
      creating build/temp.macosx-14-arm64-cpython-311/src
      creating build/temp.macosx-14-arm64-cpython-311/src/lxml
      clang -Wsign-compare -Wunreachable-code -fno-common -dynamic -DNDEBUG -g -fwrapv -O3 -Wall -isysroot /Library/Developer/CommandLineTools/SDKs/MacOSX14.sdk -DCYTHON_CLINE_IN_TRACEBACK=0 -I/Library/Developer/CommandLineTools/SDKs/MacOSX.sdk/usr/include -Isrc -Isrc/lxml/includes -I/Users/blum/Projects/lingrex/venv/fuzzy/include -I/opt/homebrew/opt/python@3.11/Frameworks/Python.framework/Versions/3.11/include/python3.11 -c src/lxml/etree.c -o build/temp.macosx-14-arm64-cpython-311/src/lxml/etree.o -w -flat_namespace
      src/lxml/etree.c:265439:14: error: incomplete definition of type 'struct _frame'
                  f->f_back = PyThreadState_GetFrame(tstate);
                  ~^
      /opt/homebrew/opt/python@3.11/Frameworks/Python.framework/Versions/3.11/include/python3.11/pytypedefs.h:22:16: note: forward declaration of 'struct _frame'
      typedef struct _frame PyFrameObject;
                     ^
      src/lxml/etree.c:265476:19: error: incomplete definition of type 'struct _frame'
              Py_CLEAR(f->f_back);
                       ~^
      /opt/homebrew/opt/python@3.11/Frameworks/Python.framework/Versions/3.11/include/python3.11/object.h:581:44: note: expanded from macro 'Py_CLEAR'
              PyObject *_py_tmp = _PyObject_CAST(op); \
                                                 ^~
      /opt/homebrew/opt/python@3.11/Frameworks/Python.framework/Versions/3.11/include/python3.11/object.h:107:49: note: expanded from macro '_PyObject_CAST'
      #define _PyObject_CAST(op) _Py_CAST(PyObject*, (op))
                                                      ^~
      /opt/homebrew/opt/python@3.11/Frameworks/Python.framework/Versions/3.11/include/python3.11/pyport.h:24:38: note: expanded from macro '_Py_CAST'
      #define _Py_CAST(type, expr) ((type)(expr))
                                           ^~~~
      /opt/homebrew/opt/python@3.11/Frameworks/Python.framework/Versions/3.11/include/python3.11/pytypedefs.h:22:16: note: forward declaration of 'struct _frame'
      typedef struct _frame PyFrameObject;
                     ^
      src/lxml/etree.c:265476:19: error: incomplete definition of type 'struct _frame'
              Py_CLEAR(f->f_back);
                       ~^
      /opt/homebrew/opt/python@3.11/Frameworks/Python.framework/Versions/3.11/include/python3.11/object.h:583:14: note: expanded from macro 'Py_CLEAR'
                  (op) = NULL;                        \
                   ^~
      /opt/homebrew/opt/python@3.11/Frameworks/Python.framework/Versions/3.11/include/python3.11/pytypedefs.h:22:16: note: forward declaration of 'struct _frame'
      typedef struct _frame PyFrameObject;
                     ^
      src/lxml/etree.c:268244:5: error: incomplete definition of type 'struct _frame'
          __Pyx_PyFrame_SetLineNumber(py_frame, py_line);
          ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
      src/lxml/etree.c:528:62: note: expanded from macro '__Pyx_PyFrame_SetLineNumber'
        #define __Pyx_PyFrame_SetLineNumber(frame, lineno)  (frame)->f_lineno = (lineno)
                                                            ~~~~~~~^
      /opt/homebrew/opt/python@3.11/Frameworks/Python.framework/Versions/3.11/include/python3.11/pytypedefs.h:22:16: note: forward declaration of 'struct _frame'
      typedef struct _frame PyFrameObject;
                     ^
      4 errors generated.
      Compile failed: command '/usr/bin/clang' failed with exit code 1
      creating var
      creating var/folders
      creating var/folders/tp
      creating var/folders/tp/38bwgqbn0_qgdjzjm2xkbbsm0000gp
      creating var/folders/tp/38bwgqbn0_qgdjzjm2xkbbsm0000gp/T
      cc -I/Library/Developer/CommandLineTools/SDKs/MacOSX.sdk/usr/include -I/usr/include/libxml2 -c /var/folders/tp/38bwgqbn0_qgdjzjm2xkbbsm0000gp/T/xmlXPathInitfjei9fa1.c -o var/folders/tp/38bwgqbn0_qgdjzjm2xkbbsm0000gp/T/xmlXPathInitfjei9fa1.o
      cc var/folders/tp/38bwgqbn0_qgdjzjm2xkbbsm0000gp/T/xmlXPathInitfjei9fa1.o -L/Library/Developer/CommandLineTools/SDKs/MacOSX.sdk/usr/lib -lxml2 -o a.out
      error: command '/usr/bin/clang' failed with exit code 1
      [end of output]

The problem is documented in the lxml bug tracker: https://bugs.launchpad.net/lxml/+bug/1969912

There are two possible solutions:

  1. Upgrade lxml to 4.9.x in the lingpy dependencies
  2. Make clear that lingrex does not support Python 3.11 on macOS (and probably Windows)
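Whichever option is chosen, the affected combination can be detected before the build fails. The following is a minimal sketch: the 4.9 threshold is taken from the first option above and has not been checked against the lxml release notes here, and `parse_version` and `lxml_needs_upgrade` are hypothetical helpers, not part of lingpy or lingrex.

```python
import sys
from importlib import metadata
from itertools import takewhile


def parse_version(text):
    """Turn '4.8.0' into (4, 8, 0); stops at the first non-numeric part."""
    parts = []
    for piece in text.split("."):
        digits = "".join(takewhile(str.isdigit, piece))
        if not digits:
            break
        parts.append(int(digits))
    return tuple(parts)


def lxml_needs_upgrade(python_version, lxml_version):
    """True for the combination reported in this issue.

    Assumption: Python >= 3.11 requires lxml >= 4.9 (threshold from the issue).
    """
    return tuple(python_version[:2]) >= (3, 11) and parse_version(lxml_version) < (4, 9)


if __name__ == "__main__":
    try:
        installed = metadata.version("lxml")
    except metadata.PackageNotFoundError:
        installed = None
    if installed is None:
        print("lxml is not installed in this environment")
    elif lxml_needs_upgrade(sys.version_info, installed):
        print(f"lxml {installed} is likely too old for this Python version")
    else:
        print(f"lxml {installed} should be fine under the assumed threshold")
```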

Desegment all strings before aligning predictions

When aligning predictions, I assume we can use a very simple method such as mult_align, since we do not expect many problems here, or we can simply check that the predictions have the same length. But since we trim alignments individually in each run, we need to make sure that all predictions are represented in the same way.
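As an illustration of why the length check alone is not enough, here is a small sketch. It assumes that "desegmenting" means splitting merged segments joined by "." back into single sounds; `desegment` and `same_length` are hypothetical helpers. The two toy predictions are simplified from the Fuzzy rows of cognate set 242 in the manual-alignment issue (frequencies dropped, first variant kept).

```python
def desegment(tokens, sep="."):
    """Split merged segments such as 'rⁿ.o' back into single sounds."""
    sounds = []
    for token in tokens:
        sounds.extend(token.split(sep))
    return sounds


def same_length(predictions):
    """True if all predictions have the same number of tokens."""
    return len({len(p) for p in predictions}) <= 1


# Two runs that trimmed the same proto-form differently.
run_a = ["p", "a", "ɨ", "rⁿ.o"]
run_b = ["p", "a.ɨ", "rⁿ", "o"]

# Both have four tokens, yet their columns do not correspond.
print(same_length([run_a, run_b]), run_a == run_b)

# After desegmenting, both runs yield the same sound sequence.
print(desegment(run_a) == desegment(run_b))
```

In this toy case the same-length check passes before desegmenting even though the tokens differ, which is the situation the issue warns about.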

Taking advantage of manual alignments

Following the discussion in the paper repository, it might be worthwhile to check if/how we can take advantage of manual alignments provided within the datasets. This might solve a good part of the confused sounds which arise due to erroneous alignments.
