vocab-mashup's Introduction

Vocabulary Mashup

A cheating pseudo-entry in NaNoGenMo 2015.

This code mashes up two source texts. The first text is used for structure, while the second provides the vocabulary. Word replacements are chosen to be semantically close (using word2vec and part-of-speech identification) as well as similar in frequency between the texts.

pip3 install click gensim
python3 mashup.py mash input/alices.txt input/bible.txt
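
The matching idea described above (semantic closeness combined with similar frequency) can be sketched roughly as follows. This is not the code from mashup.py: the scoring formula, the `freq_weight` parameter, the function names, and the toy two-dimensional vectors are all assumptions made for illustration. Real word2vec vectors would come from gensim, and the part-of-speech filter is left out.

```python
import math
from collections import Counter


def relative_frequencies(tokens):
    """Map each word to its share of all tokens in the text."""
    counts = Counter(tokens)
    total = sum(counts.values())
    return {word: n / total for word, n in counts.items()}


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def match_vocabulary(structure_tokens, vocab_tokens, vectors, freq_weight=0.1):
    """Pick one vocabulary-text word for each word of the structure text.

    Hypothetical scoring: cosine similarity minus a penalty on the
    difference in log relative frequency. The real mashup.py may differ.
    """
    src_freq = relative_frequencies(structure_tokens)
    dst_freq = relative_frequencies(vocab_tokens)
    mapping = {}
    for word, f_src in src_freq.items():
        if word not in vectors:
            continue
        best, best_score = None, -math.inf
        for cand, f_dst in dst_freq.items():
            if cand not in vectors:
                continue
            score = cosine(vectors[word], vectors[cand])
            score -= freq_weight * abs(math.log(f_src) - math.log(f_dst))
            if score > best_score:
                best, best_score = cand, score
        if best is not None:
            mapping[word] = best
    return mapping


# Toy 2-d vectors standing in for word2vec embeddings (made up for illustration).
toy_vectors = {
    "queen": (0.9, 0.1),
    "king": (0.85, 0.2),
    "rabbit": (0.1, 0.9),
    "lamb": (0.15, 0.95),
}
structure = ["queen", "rabbit", "queen"]
vocab = ["king", "lamb", "king"]
print(match_vocabulary(structure, vocab, toy_vectors))
```

With these toy inputs, "queen" pairs with "king" and "rabbit" with "lamb", since each pair is close in both vector direction and relative frequency.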

Gender mashup

python mashup.py gender input/alices.txt
Computing frequencies for input/alices.txt
Loading POS tags
Loaded
guinea [('m', 0.0014285714285714286), ('f', 0.0006184291898577613), ('n', 0.0022189349112426036)]
Cheshire [('m', 0.0), ('f', 0.0018552875695732839), ('n', 0.0029585798816568047)]
William [('m', 0.005714285714285714), ('f', 0.0012368583797155227), ('n', 0.0014792899408284023)]
whiting [('m', 0.004285714285714286), ('f', 0.0006184291898577613), ('n', 0.0014792899408284023)]
brown [('m', 0.0014285714285714286), ('f', 0.0006184291898577613), ('n', 0.005177514792899409)]
cook [('m', 0.005714285714285714), ('f', 0.0055658627087198514), ('n', 0.0007396449704142012)]
lily [('m', 0.0), ('f', 0.004947433518862091), ('n', 0.005177514792899409)]
Kitty [('m', 0.0014285714285714286), ('f', 0.004329004329004329), ('n', 0.011834319526627219)]
March [('m', 0.01), ('f', 0.0024737167594310453), ('n', 0.013313609467455622)]
Humpty [('m', 0.017142857142857144), ('f', 0.0074211502782931356), ('n', 0.016272189349112426)]
Dumpty [('m', 0.017142857142857144), ('f', 0.0074211502782931356), ('n', 0.016272189349112426)]
Alice [('m', 0.14), ('f', 0.22077922077922077), ('n', 0.1952662721893491)]
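
Each output line above is a word followed by a Python-style list of (label, score) pairs. A small sketch for reading such a line back in is shown here; the helper name `parse_gender_line` is hypothetical, and the sketch makes no claim about how mashup.py computes or interprets the scores. It only reports which label has the largest value.

```python
import ast


def parse_gender_line(line):
    """Split 'word [(label, score), ...]' into the word and a label->score dict."""
    word, _, rest = line.partition(" ")
    scores = dict(ast.literal_eval(rest))
    return word, scores


line = "Alice [('m', 0.14), ('f', 0.22077922077922077), ('n', 0.1952662721893491)]"
word, scores = parse_gender_line(line)
top_label = max(scores, key=scores.get)
print(word, top_label)  # prints "Alice f"
```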

vocab-mashup's People

Contributors

mewo2

vocab-mashup's Issues

fyi - south park + ml presidential mashup

https://raw.githubusercontent.com/dmdouglass/SouthParkDialogue/master/All-seasons.csv
https://github.com/johndpope/speech-dl

https://gist.github.com/johndpope/f235f213ef5a0d47729cb8f57f966192

python mashup.py mash interview-vice-president-kktx-radio-1360 southpark

Computing frequencies for interview-vice-president-kktx-radio-1360
Computing frequencies for southpark
Grouping POS
Loading POS tags
Loaded
Matching vocabulary
0
Loading backup w2v
/Users/johnpope/miniconda3/envs/tensorflow/lib/python3.6/site-packages/gensim/matutils.py:737: FutureWarning: Conversion of the second argument of issubdtype from int to np.signedinteger is deprecated. In future, it will be treated as np.int64 == np.dtype(int).type.
if np.issubdtype(vec.dtype, np.int):
president time
vice dude
people people
thank think
way way
texas kyle
local right
resources guys
morning cartman
support hell
officials boys
continue want
storm school
frank god
station park
radio man
fema alright
hurricane mom
harvey stan
state boy
guard dad
national good
message son
federal sure
government kid
corpus love
christi kenny
affected supposed
prayers kids
begin look
say say
families children
mike eric
pence world
time day
provide take
place money
southeast sorry
harm work
recovery ass
flooding trying
need need
financial new
available great
water town
important stupid
emergency guy
commitment friend
assure tell
houston jesus
area lot
congress jesus
work life
community kind
edt night
white cool
house house
done done
information care
enduring making
landfall chef
incredible nice
responders butters
coast name
public big
lives friends
entire real
lady family
week year
necessary bad
efforts thanks
rescue fun
long long
disaster problem
scale stuff
american gay
governor son
region game
sound cause
voice idea
assistance ass
services parents
stations things
listen stop
shelter kid
ground head
unprecedented cool
response fun
please please
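
For anyone wanting to try the pairs above on a sentence, here is a rough sketch of applying such a word-to-word mapping. The helper names `load_pairs` and `apply_mapping` are made up for this example, and the capitalization handling is an assumption; it is not how mashup.py itself writes its output. It expects only the two-word pair lines, not the other log lines.

```python
import re


def load_pairs(lines):
    """Build a source-word -> replacement dict from 'source replacement' lines."""
    mapping = {}
    for line in lines:
        parts = line.split()
        if len(parts) == 2:  # skip anything that is not a simple pair
            mapping[parts[0]] = parts[1]
    return mapping


def apply_mapping(text, mapping):
    """Replace each mapped word, keeping an initial capital if the original had one."""
    def swap(match):
        word = match.group(0)
        repl = mapping.get(word.lower())
        if repl is None:
            return word
        return repl.capitalize() if word[0].isupper() else repl
    return re.sub(r"[A-Za-z]+", swap, text)


pairs = load_pairs(["president time", "vice dude", "hurricane mom", "texas kyle"])
print(apply_mapping("The vice president spoke about the hurricane in Texas.", pairs))
# prints "The dude time spoke about the mom in Kyle."
```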
