Comments (11)

LinguList commented on July 21, 2024

here's the code to run it:

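from lingpy import *

# process the files in reverse order (IELex first, then ABVD)
files = ['ABVD_full.txt', 'IELex-2016.tsv']
for f in files[::-1]:
    print(f)
    # check=True asks LexStat to check the transcriptions for errors
    lex = LexStat('data/'+f, check=True, transcription='transcription')
    # cluster with the Turchin method and store the cognate IDs in 'cogs'
    lex.cluster(method='turchin', ref='cogs')
    # write the cognate sets to a nexus file, with '?' for missing data
    lex.output('paps.nex', filename=f, ref='cogs', missing='?')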

But I just checked: this yields some errors in both IELex and ABVD. Not many, but it would still be useful to check them or have them checked.

from autocogphylo.

LinguList commented on July 21, 2024

here's the nexus for IELex:

IELex-2016.tsv.paps.nex.txt

LinguList commented on July 21, 2024

@PhyloStar, tell me how you prefer to handle this: do you want me to submit simple scripts that take care of things, so you can call them from the command line, or should I rather just point to code solutions and have you do the lifting, so you can see that it is a more integrated framework?

PhyloStar commented on July 21, 2024

@LinguList Simple scripts might be sufficient. Running LexStat on the whole of ABVD might take some time.

I am using ASJP sound classes for running LDN and PMI. Do you think we need to stick to SCA or ASJP or test with both?

LinguList commented on July 21, 2024

Testing with both or with only one should not make a big difference, I'd say...

PhyloStar commented on July 21, 2024

Okay. Let's say I test LDN and PMI with ASJP. Using ASJP for those is part of the tradition. :) Then again, SCA+LexStat/Turchin is also part of the tradition. :)

PhyloStar commented on July 21, 2024

I get a memory error when running Turchin on the ABVD full dataset.

LinguList commented on July 21, 2024

LinguList commented on July 21, 2024

Here's how you compute Turchin:

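from lingpy import *

dolgos = {}
wl = Wordlist('ABVDxxx.tsv')
for idx, tokens in iter_rows(wl, 'tokens'):
    # convert segments to Dolgopolsky classes; pad with 'H' for short words
    dolgo = tokens2class(tokens, 'dolgo') + ['H']
    # a word-initial vowel counts as an initial 'H'
    if dolgo[0] == 'V':
        dolgo = ['H'] + dolgo
    # keep the first two non-vowel classes
    dstring = ''.join([d for d in dolgo if d != 'V'])[:2]
    # words can only match within the same concept
    dolgos[idx] = dstring+'-'+wl[idx, 'concept']
# the function is applied to the values of the dictionary
wl.add_entries('cog', dolgos, lambda x: x)
wl.renumber('cog', 'turchinid')
wl.output('paps.nex', ref='turchinid', missing='?')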

LinguList commented on July 21, 2024

The basic idea is: convert the segments of each word to Dolgopolsky sound classes, keep the first two consonant classes as a two-letter string, and then use lingpy's functions for renumbering cognate sets to convert those strings to numbers. This is much faster and easier than invoking the LexStat model.
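
To make the idea concrete without needing lingpy or the datasets, here is a minimal pure-Python sketch of the same two steps. It assumes each word has already been converted to a list of Dolgopolsky class symbols with 'V' marking vowels; the function names `turchin_key` and `renumber` and the toy words are made up for illustration, and `renumber` only imitates what lingpy's renumbering does (the actual IDs lingpy assigns may differ).

```python
def turchin_key(classes, concept):
    """Build the two-consonant key for one word.

    classes: list of Dolgopolsky class symbols, with 'V' for vowels
             (assumed to be precomputed, e.g. by a sound-class converter).
    concept: the meaning slot of the word; words only match within a concept.
    """
    # pad with 'H' so that words with a single consonant still yield two symbols
    padded = list(classes) + ['H']
    # a word-initial vowel is treated as an initial 'H'
    if padded[0] == 'V':
        padded = ['H'] + padded
    # drop vowels and keep the first two consonant classes
    consonants = [c for c in padded if c != 'V']
    return ''.join(consonants[:2]) + '-' + concept


def renumber(keys):
    """Map each distinct key to an integer ID, in order of first appearance."""
    ids = {}
    return {idx: ids.setdefault(key, len(ids) + 1) for idx, key in keys.items()}


# toy data (hypothetical): word ID -> (class symbols, concept)
words = {
    1: (['K', 'V', 'N'], 'hand'),
    2: (['K', 'V', 'N', 'V'], 'hand'),
    3: (['V', 'T'], 'hand'),
    4: (['K', 'V', 'N'], 'foot'),
}
keys = {idx: turchin_key(classes, concept) for idx, (classes, concept) in words.items()}
cognate_ids = renumber(keys)
print(keys)
print(cognate_ids)
```

Words 1 and 2 end up in the same set because they share the first two consonant classes and the concept; word 4 has the same consonant classes but a different concept, so it gets its own ID.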

LinguList commented on July 21, 2024

Just added the code as "turchin.py". Note that this is much faster and closer to the original description.
