
morpheus's Introduction

morpheus

Tufts University holds the overall copyright to the Perseus Digital Library; the materials therein (including all texts, translations, images, descriptions, drawings, etc.) are provided for the personal use of students, scholars, and the public.

Materials within the Perseus DL have varying copyright status: please contact the project for more information about a specific component or object. Copyright is protected by the copyright laws of the United States and the Universal Copyright Convention.

Unless otherwise indicated, all contents of this repository are licensed under a Creative Commons Attribution-ShareAlike 3.0 United States License. You must offer Perseus any modifications you make.

Morpheus parser code.

Compiling and installing morpheus

By default morpheus installs into bin/

  cd src
  make
  make install

Compiling a stem library

  cd stemlib/Latin
  export PATH=$PATH:../../bin
  MORPHLIB=.. make

Running the cruncher

  MORPHLIB=stemlib bin/cruncher < wordlist > crunched
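
The command above expects a file named wordlist. A minimal sketch of one way to produce it from running Latin text, assuming cruncher reads one word per line (as the interactive session quoted in the issues below suggests). The function name make_wordlist and the file name input.txt are hypothetical, and the letter filter only suits plain ASCII Latin; it would mangle Greek beta code such as gra/fousi.

```shell
#!/bin/sh
# Turn running text on stdin into a sorted, de-duplicated word list,
# one word per line. ASCII letters only; everything else is a separator.
make_wordlist() {
    LC_ALL=C tr -cs 'A-Za-z' '\n' | sed '/^$/d' | LC_ALL=C sort -u
}

# Hypothetical usage from the repository root; input.txt is a placeholder.
# The guard keeps the script harmless when cruncher has not been built yet.
if [ -x bin/cruncher ] && [ -f input.txt ]; then
    make_wordlist < input.txt > wordlist
    MORPHLIB=stemlib bin/cruncher < wordlist > crunched
fi
```

Case is left untouched on purpose, since one of the reports below concerns capitalized forms being parsed differently.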

morpheus's People

Contributors

balmas, connormayer, lcerrato, nickjwhite

morpheus's Issues

[Greek] Missing parses for γράφουσι, ἀνέθηκε

In Greek, Morpheus, as verified both on Morphology Service and in Perseids / Arethusa, does not recognize γράφουσι (but cf. http://www.perseus.tufts.edu/hopper/morph?l=gra%2Ffousi&la=greek ), which should be connected with http://data.perseus.org/collections/urn:cite:perseus:grclexent.lex16845.1 , and ἀνέθηκε or ἀνέθηκα (but cf. http://www.perseus.tufts.edu/hopper/morph?l=a)ne%2Fqhke&la=greek ) - should be connected with http://data.perseus.org/collections/urn:cite:perseus:grclexent.lex88801.1 .

recipe for target 'setquant.c' failed

I have very basic programming skills, so forgive me if this is a trivial issue. When launching "make", I get this error message:

  make[1]: Entering directory '/home/matteo/Downloads/morpheus-master/src/gkdict'
  flex -t setquant.l > setquant.c
  /bin/sh: 1: flex: not found
  <builtin>: recipe for target 'setquant.c' failed
  make[1]: *** [setquant.c] Error 127
  rm setquant.c
  make[1]: Leaving directory '/home/matteo/Downloads/morpheus-master/src/gkdict'
  makefile:2: recipe for target 'all' failed
  make: *** [all] Error 2

I understand that the file "setquant.c" is missing, but I have just downloaded the folder without changing anything, so I don't know why.
Does anyone know how to solve it?
Thanks in advance
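
The log above says "flex: not found", so setquant.c is absent only because the tool that generates it from setquant.l is not installed. A minimal sketch of a pre-build check, assuming flex and make are the tools to look for (both appear in this page; the function name missing_tools is hypothetical):

```shell
#!/bin/sh
# Print the name of every given tool that cannot be found on PATH.
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
    done
}

# Check the tools the build is known to call before running make.
missing=$(missing_tools flex make)
if [ -n "$missing" ]; then
    echo "Install these before running make:" $missing
fi
```

If flex is reported, installing it through the system's package manager and re-running make should let the build get past this step.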

[Latin] volandi: gerund not gerundive

user report:
The word, volandi, of flying, is a gerund NOT a gerundive. When you click on the word in Perseus it shows it as a gerundive, sing, gen, neuter case. I fussed over this word for 1 hour trying to understand how it could be a gerundive. It cannot. It ends in -i which is the genitive case and it is singular, modifying a plural word, praecepta (praeceptum, -i). Because of this, it HAS to be a gerund otherwise it would be plural, volandorum, which it is not.

TEXT:

Instruit et natum “medio” que “ut limite curras,
Icare,” ait “moneo, ne, si demissior ibis,
unda gravet pennas, si celsior, ignis adurat. (Line 205)
Inter utrumque vola. Nec te spectare Booten
aut Helicen iubeo strictumque Orionis ensem:
me duce carpe viam.” Pariter praecepta volandi (Line 208)
tradit et ignotas umeris accommodat alas.

some fixes for further bitrot

a.txt
As of a few years ago, the default for gcc became -fno-common. This causes the code no longer to compile. To fix this, I added the following to the main makefile:

export MORE_CFLAGS := -fcommon -Wno-implicit-int -Wno-format-overflow -Wno-implicit-function-declaration \
    -Wno-return-type -Wno-format -Wno-builtin-declaration-mismatch

Then I edited each of the makefiles like this:

CFLAGS= -O2 -I${INCLUDE} ${MORE_CFLAGS}

I've attached a patch file.

Other Linux users will probably also find, as I did, that -lfl doesn't work unless you install a separate package. On Linux Mint, although I already had the flex package installed, I also needed to install libfl-dev; without it, -lfl wouldn't work.

This still does not fix the problem with the missing ending files in stemlib/Latin and stemlib/Greek, as described here by Lutetiensis: #23
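
The per-makefile edit described above can be scripted. This is a minimal sketch, not the attached patch: it assumes each makefile sets CFLAGS on a single line beginning with "CFLAGS", and the function name add_more_cflags and the glob in the usage comment are hypothetical.

```shell
#!/bin/sh
# Append ${MORE_CFLAGS} to every CFLAGS assignment in one makefile,
# skipping lines that already mention MORE_CFLAGS so reruns are safe.
add_more_cflags() {
    file=$1
    tmp=$(mktemp) || return 1
    # The single quotes keep ${MORE_CFLAGS} literal, so make expands it later.
    sed -e '/^CFLAGS[[:space:]]*=/{' \
        -e '/MORE_CFLAGS/!s/[[:space:]]*$/ ${MORE_CFLAGS}/' \
        -e '}' "$file" > "$tmp" && cat "$tmp" > "$file"
    rm -f "$tmp"
}

# Hypothetical usage from the repository root (adjust the glob to the
# actual makefile locations):
#   for f in src/*/makefile; do add_more_cflags "$f"; done
```

Writing through a temporary file rather than using sed -i keeps the sketch portable between GNU and BSD sed.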

[Greek] problems with prosagwgos/on, missing forms, accentuation

There seem to be forms of this word missing from the searches and analysis tables, especially as relates to the accentuation.

input         settings   result
prosagwgo/s   exact      no results
prosagwgo/s   expanded   3 results (Thuc + DH x 2)
prosagwgo/n   exact      1 result (Lucian)
prosagwgo/n   expanded   3 results (Thuc + DH x 2)
prosagwgo\n   exact      no results
prosagwgo\n   expanded   3 results (Thuc + DH x 2)

Different forms of the same word are not being found in search: the Lucian instance appears only when an exact search is done, and disappears when the expanded search is done.

NB: This may be a result of the indices on the site being out of sync with the actual text data rather than a problem with the tables themselves. Still worth checking.

I did a walkthrough of the problem in a PDF, available in Mantis.

sumbe/bhken not recognized

	% cruncher
	ai)lou/rou
	ai)lou/rou
	<NL>N ai)/louros  masc/fem gen sg                       os_ou</NL>
	e)/rrwsqe
	e)/rrwsqe
	<NL>V rmi  plup ind mp 2nd pl                   perfp_vow,o_stem</NL><NL>V rmi  perf imperat mp 2nd pl                  perfp_vow,o_stem</NL><NL>V rmi  perf ind mp 2nd pl                     perfp_vow,o_stem</NL>
	:longtime       0.00    e)/rrwsqe
	sumbe/bhken
	sumbe/bhken

[Latin] lepos

lepos does not return any results
Lewis and Short has lepos = lepor
(Oxford Latin Dictionary has a separate entry for lepos)

[Latin] feliciter

0001028: error in search for feliciter in Latin
Description: Both the current and build environments return an error on a search for this word, which occurs in many texts:

the lemmatized and exact search both fail:
/hopper/searchresults?target=la&collections=Perseus%3Acollection%3AGreco-Roman&all_words=feliciter&all_words_expand=on&phrase=&any_words=&exclude_words=&search=Search

/hopper/searchresults?target=la&collections=Perseus%3Acollection%3AGreco-Roman&all_words=feliciter&phrase=&any_words=&exclude_words=&search=Search

This works when using the verb derivation apparently:

  1. search on feliciterque
    analyzed as being the verb (not the adverb?)
  2. link to collection search /hopper/searchresults?language=la&inContent=true&all_words=&all_words_expand=true&any_words=felicitare+feliciterque+felicita+felicitati+felicitatique+felicitent+felicitate+felicitatis+felicitatisque+felicitas+felicitat+felicitet+felicites+felicitateque+felicitant+feliciter+felicito&bad_words=#

This includes allowed words
felicitare feliciterque felicita felicitati felicitatique felicitent felicitate felicitatis felicitatisque felicitas felicitat felicitet felicites felicitateque felicitant feliciter felicito

vita parsing

Search on all forms of vita fails to return results for vitam et al.
Looks like data may be regularized to uita?
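
If the indexed data is indeed regularized this way, which the report only suspects, a query would need the same treatment before lookup. A minimal sketch of that normalization; the function name normalize_uv is hypothetical and nothing here is taken from the Perseus code:

```shell
#!/bin/sh
# Rewrite consonantal v as u on stdin, e.g. vitam -> uitam.
normalize_uv() {
    tr 'vV' 'uU'
}

printf 'vitam\n' | normalize_uv
```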

Parser bug

Capitalized forms of appono also get applumbo parses.

Problems with Latin supines

Hi everybody,

I think I found a problem in Morpheus's interpretation of Latin supines.

The list of probable morphosyntactic properties we get when we ask for information about e.g. "pastum" (from Phaedrus' Aesopiae 2.4.14) includes a 'noun sg supine neut nom' (encoded n-s-u-nn-) option, while "pastu" is analysed as 'noun sg supine neut dat' (encoded n-s-u-nd-).

However, despite some historical issues, supines are usually classified as "non-finite verbal forms" in most Latin grammars (including Pinkster 2015: 64), with the so-called first supine (-um) being an accusative and the second (-u) an ablative.

So I'm wondering if it would be better to change both analyses: the first one to "verb supine accusative" (v---u--a-) and the second to "verb supine ablative" (v---u--b-).

Since the same result is observed when using the Latin Word Study Tool (Perseus Hopper) as well as the Arethusa tool (Perseids platform), it seems to be systemic.

Thank you for your time and apologies for my intermediate English skills.

Best regards,
Lucas

Libraries do not compile properly

After finally getting cruncher to compile properly after much difficulty (it does not work on all systems, despite what the docs say), I then tried to compile the libraries. It seems to work fine for nouns, but when it gets to verbs everything falls apart. Here is the error I'm seeing (on a late-2013 MBP with Mojave):

cat stemsrc/vbs.latin.bas stemsrc/vbs.latin.irreg stemsrc/vbs.latin stemsrc/vbs.irreg stemsrc/vbs.mpi | \
	    perl -pe 's/([a-z])([aei])_v[ \t]+perfstem/$1\t$2vperf/g;' > \
	    conjfile
cat: stemsrc/vbs.mpi: No such file or directory
do_conj -L
make: *** [stemind/vbind] Abort trap: 6
christiansmbp2:Latin christiancasey$ 

I believe the problem is that my system lacks something that it needs, but I can't seem to figure out what it is. It could also have something to do with my workaround for the problem with Flex (see my response to another issue for details). The error message: Abort trap: 6 is wildly unhelpful.

Missing files from stemlib/Latin

The following files seem to be missing from the repository:

morpheus/stemlib/Latin/endtables/source/aLs_aris.end
morpheus/stemlib/Latin/endtables/source/abbrev.end
morpheus/stemlib/Latin/endtables/source/adverb.end
morpheus/stemlib/Latin/endtables/source/alphabetic.end
morpheus/stemlib/Latin/endtables/source/article.end
morpheus/stemlib/Latin/endtables/source/as_anis.end
morpheus/stemlib/Latin/endtables/source/conj.end
morpheus/stemlib/Latin/endtables/source/connect.end
morpheus/stemlib/Latin/endtables/source/demonstr.end
morpheus/stemlib/Latin/endtables/source/demonstr.end
morpheus/stemlib/Latin/endtables/source/es_idis.end
morpheus/stemlib/Latin/endtables/source/exclam.end
morpheus/stemlib/Latin/endtables/source/expletive.end
morpheus/stemlib/Latin/endtables/source/indecl.end
morpheus/stemlib/Latin/endtables/source/indecl_adj.end
morpheus/stemlib/Latin/endtables/source/indecl_noun.end
morpheus/stemlib/Latin/endtables/source/indef.end
morpheus/stemlib/Latin/endtables/source/indef_pron.end
morpheus/stemlib/Latin/endtables/source/indef_rel_pron.end
morpheus/stemlib/Latin/endtables/source/interrog.end
morpheus/stemlib/Latin/endtables/source/interrog.end
morpheus/stemlib/Latin/endtables/source/irreg_adj1.end
morpheus/stemlib/Latin/endtables/source/irreg_adj2.end
morpheus/stemlib/Latin/endtables/source/irreg_adj3.end
morpheus/stemlib/Latin/endtables/source/irreg_nom2.end
morpheus/stemlib/Latin/endtables/source/irreg_nom3.end
morpheus/stemlib/Latin/endtables/source/numeral.end
morpheus/stemlib/Latin/endtables/source/or_uris.end
morpheus/stemlib/Latin/endtables/source/particle.end
morpheus/stemlib/Latin/endtables/source/pers_pron.end
morpheus/stemlib/Latin/endtables/source/prep.end
morpheus/stemlib/Latin/endtables/source/pron1.end
morpheus/stemlib/Latin/endtables/source/pron2.end
morpheus/stemlib/Latin/endtables/source/pron3.end
morpheus/stemlib/Latin/endtables/source/pron_adj1.end
morpheus/stemlib/Latin/endtables/source/pron_adj3.end
morpheus/stemlib/Latin/endtables/source/reflex.end
morpheus/stemlib/Latin/endtables/source/rel_pron.end
morpheus/stemlib/Latin/endtables/source/relative.end
morpheus/stemlib/Latin/endtables/source/s_dis_adj.end

and

morpheus/stemlib/Latin/stemsrc/vbs.mpi

I could not find them anywhere in the history of this repository, and they don't seem to be autogenerated. Would anyone have them? This would also fix @christiancasey's #18.

@lcerrato (sorry to ping you directly), this repository seems to be abandoned, are you looking for contributors? I am very seriously willing to fix bugs, modernize the repository, update the code... Let me know.

Thanks.
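
To see which of the files listed above are absent from a given checkout, a short check can be run before attempting the stem library build. A minimal sketch, assuming the paths are relative to the directory that contains the morpheus checkout; the function name list_missing is hypothetical:

```shell
#!/bin/sh
# Read paths on stdin (one per line) and print each one that is not
# an existing regular file. Blank lines are ignored.
list_missing() {
    while IFS= read -r path; do
        [ -n "$path" ] || continue
        [ -f "$path" ] || printf '%s\n' "$path"
    done
}

# Example with two of the paths named in the report.
printf '%s\n' \
    morpheus/stemlib/Latin/endtables/source/adverb.end \
    morpheus/stemlib/Latin/stemsrc/vbs.mpi | list_missing
```

An empty result would mean every listed file is present. Saving the full list from the report to a file and redirecting it into list_missing checks all of them at once.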

Morpheus parsing llt-tokenized texts

People can request to mark up enclitics (the Pisa guys who'd like to annotate some Seneca already did). In most cases this comes down to a hyphen, e.g. arma virumque becomes arma virum -que. I think this makes a lot of sense, especially with other cases such as an enclitic ne, because without marking them as split up there would be no chance to disambiguate an enclitic -ne from the 'real' ne (used for negations etc.).

However, Morpheus doesn't really know what to do with the hyphen: -c and -que remain unidentified entirely, -ne is said to be a form of neo1, etc.

[Latin] audaci: ablative, not dative

user report:
The word in line 223, book VIII Ovid, Metamorphoses, is not dative, but an ablative, adjective, singular. It modifies volatu in the same line which is ablative because of gaudere, which takes ablative.

dextra Lebinthos erat fecundaque melle Calymne,
cum puer audaci coepit gaudere volatu (line 223)
deseruitque ducem caelique cupidine tractus
altius egit iter. Rapidi vicinia solis (line 225)

suggested new lemmas

adjective reses, residis (in Lewis Short, not in inventory); saetiger; turriger; corniger
noun accipiter (usually parsed as verb accipitro), cos, cotis whetstone
proper names: Eprius, Vinius, Haterius, Pharasmanes, Hordeonius, Gotarzes, Menestheus
add verb redimio (is not = redimo), crebresco, inardesco
delete verb re-imito (combined in the cruncher? does not exist)
