zombiewriter's People

Contributors

gitter-badger, tra38

zombiewriter's Issues

Avoid Auto-Rebuild of LSI Indexes

ClassifierReborn::Summarizer.perform_lsi uses a neat trick to avoid LSI's default behavior of "auto-rebuilding" the index every time new content is added.

def perform_lsi(chunks, count, separator)
    lsi = ClassifierReborn::LSI.new auto_rebuild: false
    chunks.each { |chunk| lsi << chunk unless chunk.strip.empty? || chunk.strip.split.size == 1 }
    lsi.build_index
    summaries = lsi.highest_relative_content count
    summaries.reject { |chunk| !summaries.include? chunk }.map(&:strip).join(separator)
end

This could be useful behavior to have.
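To make the benefit concrete, here is a minimal sketch of the deferred-rebuild idea. ToyIndex is a hypothetical stand-in that only counts rebuilds; it is not ClassifierReborn::LSI and does no real indexing. It assumes that each rebuild is the expensive step.

```ruby
# Hypothetical stand-in for an index whose rebuild is expensive.
# It is NOT ClassifierReborn::LSI; it only counts how often it rebuilds.
class ToyIndex
  attr_reader :rebuild_count

  def initialize(auto_rebuild: true)
    @auto_rebuild = auto_rebuild
    @items = []
    @rebuild_count = 0
  end

  # Add an item; rebuild immediately only when auto_rebuild is on.
  def <<(item)
    @items << item
    build_index if @auto_rebuild
    self
  end

  # Placeholder for the costly step.
  def build_index
    @rebuild_count += 1
  end
end

chunks = ["first chunk of text", "second chunk of text", "third chunk of text"]

# Default behavior: one rebuild per added chunk.
eager = ToyIndex.new
chunks.each { |chunk| eager << chunk }

# Deferred behavior: add everything, then rebuild once.
deferred = ToyIndex.new(auto_rebuild: false)
chunks.each { |chunk| deferred << chunk }
deferred.build_index

puts "eager rebuilds:    #{eager.rebuild_count}"
puts "deferred rebuilds: #{deferred.rebuild_count}"
```

With many paragraphs the eager variant rebuilds once per paragraph, while the deferred variant rebuilds a single time at the end.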

Program crashes for larger quantities of articles

I've been using ZombieWriter and finding that it hits the same crash in Classifier-Reborn when I have a larger quantity of rows in the CSV file:

Jacks-MacBook-Pro:Projects johncambou$ ruby review-generator.rb
/Users/johncambou/.rbenv/versions/2.4.2/lib/ruby/gems/2.4.0/gems/classifier-reborn-2.2.0/lib/classifier-reborn/lsi/content_node.rb:30:in `transposed_search_vector': undefined method `col' for nil:NilClass (NoMethodError)
	from /Users/johncambou/.rbenv/versions/2.4.2/lib/ruby/gems/2.4.0/gems/classifier-reborn-2.2.0/lib/classifier-reborn/lsi.rb:190:in `block in proximity_array_for_content'
	from /Users/johncambou/.rbenv/versions/2.4.2/lib/ruby/gems/2.4.0/gems/classifier-reborn-2.2.0/lib/classifier-reborn/lsi.rb:188:in `collect'
	from /Users/johncambou/.rbenv/versions/2.4.2/lib/ruby/gems/2.4.0/gems/classifier-reborn-2.2.0/lib/classifier-reborn/lsi.rb:188:in `proximity_array_for_content'
	from /Users/johncambou/.rbenv/versions/2.4.2/lib/ruby/gems/2.4.0/gems/classifier-reborn-2.2.0/lib/classifier-reborn/lsi.rb:166:in `block in highest_relative_content'
	from /Users/johncambou/.rbenv/versions/2.4.2/lib/ruby/gems/2.4.0/gems/classifier-reborn-2.2.0/lib/classifier-reborn/lsi.rb:166:in `each_key'
	from /Users/johncambou/.rbenv/versions/2.4.2/lib/ruby/gems/2.4.0/gems/classifier-reborn-2.2.0/lib/classifier-reborn/lsi.rb:166:in `highest_relative_content'
	from /Users/johncambou/.rbenv/versions/2.4.2/lib/ruby/gems/2.4.0/gems/classifier-reborn-2.2.0/lib/classifier-reborn/lsi/summarizer.rb:29:in `perform_lsi'
	from /Users/johncambou/.rbenv/versions/2.4.2/lib/ruby/gems/2.4.0/gems/classifier-reborn-2.2.0/lib/classifier-reborn/lsi/summarizer.rb:10:in `summary'
	from /Users/johncambou/.rbenv/versions/2.4.2/lib/ruby/gems/2.4.0/gems/zombie_writer-0.2.0/lib/zombie_writer.rb:21:in `header'
	from /Users/johncambou/.rbenv/versions/2.4.2/lib/ruby/gems/2.4.0/gems/zombie_writer-0.2.0/lib/zombie_writer.rb:69:in `block in generate_articles'
	from /Users/johncambou/.rbenv/versions/2.4.2/lib/ruby/gems/2.4.0/gems/zombie_writer-0.2.0/lib/zombie_writer.rb:57:in `map'
	from /Users/johncambou/.rbenv/versions/2.4.2/lib/ruby/gems/2.4.0/gems/zombie_writer-0.2.0/lib/zombie_writer.rb:57:in `generate_articles'
	from review-generator.rb:12:in `<main>'

What's really strange to me is that this only happens for larger quantities of articles. When I have about 40 or fewer rows in the CSV, it runs fine, but once I get to around 50 or more, the program usually hits the crash.

What's even stranger is that this isn't consistent: sometimes it will crash at only 35 CSV lines, and sometimes it runs successfully at 56. Sometimes it will crash on the exact same CSV file that it processed correctly earlier.

I've meticulously tested whether this is caused by the specific content of my articles, but the program runs fine for any subset of my articles. It only crashes when I get above this rough limit in quantity.

At this point I have tried:

  • Ensuring that every line has 2 sentences
  • Giving each line only the content, and also giving each line the full sourcetext and sourceurl
  • Swapping out different article content

I'm completely lost. Ideally I'd like to run the program with 300+ paragraphs, so that I can really get crazy with the output, but it's disappointing to be capped at so few. If you have any suggestions on how to fix this it'd be greatly appreciated.

Zombiewriter crashing with an error in Classifier Reborn

Hi. I was trying to use your ZombieWriter software on Windows, but unfortunately it is not working.

require 'zombie_writer'
zombie = ZombieWriter::Randomization.new

zombie.add_string(content: "This is filler text that I invented.This is also a paragraph that could be used")
zombie.add_string(content: "This post is amazing. Please take a look")
zombie.add_string(content: "For all sports fan, you must watch this video. Hey you have to check this out.")


array = zombie.generate_articles


File.open("e:/temp/articles.md", "w+") do |f|
  array.each { |article| f.puts("#{article}\n- - -\n\n") }
end

I always get the following error:

C:/Ruby23/lib/ruby/gems/2.3.0/gems/classifier-reborn-2.1.0/lib/classifier-reborn/lsi.rb:309:in `sort': comparison of Float with NaN failed (ArgumentError)
	from C:/Ruby23/lib/ruby/gems/2.3.0/gems/classifier-reborn-2.1.0/lib/classifier-reborn/lsi.rb:309:in `build_reduced_matrix'
	from C:/Ruby23/lib/ruby/gems/2.3.0/gems/classifier-reborn-2.1.0/lib/classifier-reborn/lsi.rb:143:in `build_index'
	from C:/Ruby23/lib/ruby/gems/2.3.0/gems/classifier-reborn-2.1.0/lib/classifier-reborn/lsi/summarizer.rb:28:in `perform_lsi'
	from C:/Ruby23/lib/ruby/gems/2.3.0/gems/classifier-reborn-2.1.0/lib/classifier-reborn/lsi/summarizer.rb:10:in `summary'
	from C:/Ruby23/lib/ruby/gems/2.3.0/gems/zombie_writer-0.2.0/lib/zombie_writer.rb:21:in `header'
	from C:/Ruby23/lib/ruby/gems/2.3.0/gems/zombie_writer-0.2.0/lib/zombie_writer.rb:126:in `block in generate_articles'
	from C:/Ruby23/lib/ruby/gems/2.3.0/gems/zombie_writer-0.2.0/lib/zombie_writer.rb:116:in `each'
	from C:/Ruby23/lib/ruby/gems/2.3.0/gems/zombie_writer-0.2.0/lib/zombie_writer.rb:116:in `each_slice'
	from C:/Ruby23/lib/ruby/gems/2.3.0/gems/zombie_writer-0.2.0/lib/zombie_writer.rb:116:in `with_index'
	from C:/Ruby23/lib/ruby/gems/2.3.0/gems/zombie_writer-0.2.0/lib/zombie_writer.rb:116:in `each'
	from C:/Ruby23/lib/ruby/gems/2.3.0/gems/zombie_writer-0.2.0/lib/zombie_writer.rb:116:in `map'
	from C:/Ruby23/lib/ruby/gems/2.3.0/gems/zombie_writer-0.2.0/lib/zombie_writer.rb:116:in `generate_articles'
	from test.rb:21:in `<main>'

What do you think can be the problem? Thanks so much if you can take a look.

Trying to regenerate nanogenmo article results

Hi Tariq. My dad found this article you wrote and asked me if I can teach him how to use ZombieWriter. I tried to recreate similar articles using nanogenmo.csv and the ruby script you had in the article. I am currently trying again right now on an EC2 GPU instance and it has been at 100% CPU for the last 20 minutes so I think it is hanging.

Can you help me figure out what I am missing?

Here are my notes if that helps. I'm writing them for my dad.

Thanks in advance!

Aaron

"comparison of Float with NaN failed"...and GSL is Installed

While trying to fix an unrelated issue, I experimented with the code from #5, but using ZombieWriter::MachineLearning rather than ZombieWriter::Randomization.

zombie = ZombieWriter::MachineLearning.new

zombie.add_string(content: "This is filler text that I invented.This is also a paragraph that could be used")
zombie.add_string(content: "This post is amazing. Please take a look")
zombie.add_string(content: "For all sports fan, you must watch this video. Hey you have to check this out.")

array = zombie.generate_articles

p array

#/Users/tariqali/.rbenv/versions/2.4.0/lib/ruby/gems/2.4.0/gems/kmeans-clusterer-0.11.4/lib/kmeans-clusterer.rb:237:in `sort_by': comparison of Float with NaN failed (ArgumentError)

The culprit is the third string. Classifier-Reborn computed its lsi_norm as a vector of NaNs...

 "For all sports fan, you must watch this video. Hey you have to check this out.\n"=>
  #<ClassifierReborn::ContentNode:0x007fdec4b25ae8
   @categories=[],
   @lsi_norm=GSL::Vector
[   nan   nan   nan   nan   nan   nan   nan ... ],
   @lsi_vector=GSL::Vector
[ 0.000e+00 0.000e+00 0.000e+00 0.000e+00 0.000e+00 0.000e+00 0.000e+00 ... ],
   @raw_norm=GSL::Vector
[ 0.000e+00 0.000e+00 0.000e+00 0.000e+00 0.000e+00 0.000e+00 0.000e+00 ... ],
   @raw_vector=GSL::Vector
[ 0.000e+00 0.000e+00 0.000e+00 0.000e+00 0.000e+00 0.000e+00 0.000e+00 ... ],
   @word_hash={:for=>1, :sport=>1, :fan=>1, :must=>1, :watch=>1, :video=>1, :hei=>1, :check=>1, :out=>1}>}

Changing the third string slightly resolves the issue.

zombie = ZombieWriter::MachineLearning.new

zombie.add_string(content: "This is filler text that I invented.This is also a paragraph that could be used")
zombie.add_string(content: "This post is amazing. Please take a look")
zombie.add_string(content: "For all sports fan, you must watch this video. Hey you have to check this out. Filler, filler, filler.")

array = zombie.generate_articles

p array

 "For all sports fan, you must watch this video. Hey you have to check this out. Filler, filler, filler.\n"=>
  #<ClassifierReborn::ContentNode:0x007fd931432fd0
   @categories=[],
   @lsi_norm=GSL::Vector
[ 6.205e-01 1.432e-01 1.432e-01 1.432e-01 1.432e-01 1.432e-01 0.000e+00 ... ],
   @lsi_vector=GSL::Vector
[ 6.593e-01 1.522e-01 1.522e-01 1.522e-01 1.522e-01 1.522e-01 0.000e+00 ... ],
   @raw_norm=GSL::Vector
[ 5.547e-01 0.000e+00 0.000e+00 0.000e+00 0.000e+00 0.000e+00 0.000e+00 ... ],
   @raw_vector=GSL::Vector
[ 6.272e-01 0.000e+00 0.000e+00 0.000e+00 0.000e+00 0.000e+00 0.000e+00 ... ],
   @word_hash={:for=>1, :sport=>1, :fan=>1, :must=>1, :watch=>1, :video=>1, :hei=>1, :check=>1, :out=>1, :filler=>3}>}

But why? Both scenarios have a @word_hash, so it isn't clear why one string ends up with a vector of NaNs and the other doesn't. Is it because, in the second scenario, the third string has words in common with the first string? I will have to research this issue more carefully and decide how to handle this potential error gracefully.

This problem is unlikely to happen in the real world...if you add long passages to ZombieWriter, there are bound to be a few overlapping words that classifier-reborn can detect. But it could happen...which is why I need to figure out how to fix it.
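One possible explanation, judging only from the dumps above: the failing node's vectors show nothing but zeros in the visible portion, and scaling an all-zero vector to unit length divides 0.0 by 0.0, which is NaN in Ruby. The sketch below reproduces that arithmetic with plain arrays rather than GSL vectors. The helper names normalize and usable_vector? are hypothetical and are not part of classifier-reborn or kmeans-clusterer.

```ruby
# Scale a vector to unit length. A zero vector has length 0.0,
# so every component becomes 0.0 / 0.0, which is NaN in Ruby.
def normalize(vector)
  length = Math.sqrt(vector.sum { |x| x * x })
  vector.map { |x| x / length }
end

# Hypothetical guard: true only when every component is a finite number.
def usable_vector?(vector)
  vector.all? { |x| x.finite? }
end

zero_vector  = [0.0, 0.0, 0.0]      # like the failing node's raw_vector
other_vector = [0.6272, 0.0, 0.0]   # like the working node's raw_vector

normalized_zero  = normalize(zero_vector)
normalized_other = normalize(other_vector)

# Keep only vectors that can safely be compared or sorted later on.
candidates = [normalized_zero, normalized_other]
safe = candidates.select { |vector| usable_vector?(vector) }

puts "zero vector normalized:  #{normalized_zero.inspect}"
puts "usable vectors retained: #{safe.size} of #{candidates.size}"
```

A guard of this kind, applied before the vectors reach any sorting or clustering step, would be one way to skip such a paragraph instead of crashing. Whether skipping is the right behavior for ZombieWriter is a separate design decision.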

Need automated tests

I engaged in some "technical debt" when building ZombieWriter, and now is the time to pay it back. If ZombieWriter is to be maintainable, I will need to write some automated tests (ideally using RSpec).

narray_ext.rb:21:in `new': Argument required (ArgumentError)

Hello.
I have unfortunately not been able to get this to run. I keep getting the error:
narray_ext.rb:21:in 'new': Argument required (ArgumentError)
I realize that the error is from narray, but I believe that the gem should work out of the box when following the instructions in the Readme. I have installed GSL and the DevKit following the instructions from their page.

This is the program that generates the error:

require 'zombie_writer'

zombie = ZombieWriter::MachineLearning.new

zombie.add_string(content: "Lorem ipsum dolor sit amet.",
sourcetext: "Cicero's Great Speech On Ethics",
sourceurl: "http://example.com/lorem-ipsum")

array = zombie.generate_articles

And this is the stack trace:

Uncaught exception: Argument required
	/home/mikael/.rbenv/versions/2.4.1/lib/ruby/gems/2.4.0/gems/narray-0.6.1.2/narray/narray_ext.rb:21:in `new'
	/home/mikael/.rbenv/versions/2.4.1/lib/ruby/gems/2.4.0/gems/narray-0.6.1.2/narray/narray_ext.rb:21:in `cast'
	/home/mikael/.rbenv/versions/2.4.1/lib/ruby/gems/2.4.0/gems/kmeans-clusterer-0.11.4/lib/kmeans-clusterer.rb:13:in `ensure_matrix'
	/home/mikael/.rbenv/versions/2.4.1/lib/ruby/gems/2.4.0/gems/kmeans-clusterer-0.11.4/lib/kmeans-clusterer.rb:130:in `run'
	/home/mikael/.rbenv/versions/2.4.1/lib/ruby/gems/2.4.0/gems/zombie_writer-0.2.0/lib/zombie_writer.rb:80:in `generate_clusters'
	/home/mikael/.rbenv/versions/2.4.1/lib/ruby/gems/2.4.0/gems/zombie_writer-0.2.0/lib/zombie_writer.rb:56:in `generate_articles'
	/home/mikael/Documents/source/Zombie/main.rb:9:in `<top (required)>'
	/home/mikael/.rbenv/versions/2.4.1/bin/rdebug-ide:23:in `load'
	/home/mikael/.rbenv/versions/2.4.1/bin/rdebug-ide:23:in `<main>'
/home/mikael/.rbenv/versions/2.4.1/lib/ruby/gems/2.4.0/gems/narray-0.6.1.2/narray/narray_ext.rb:21:in `new': Argument required (ArgumentError)
	from /home/mikael/.rbenv/versions/2.4.1/lib/ruby/gems/2.4.0/gems/narray-0.6.1.2/narray/narray_ext.rb:21:in `cast'
	from /home/mikael/.rbenv/versions/2.4.1/lib/ruby/gems/2.4.0/gems/kmeans-clusterer-0.11.4/lib/kmeans-clusterer.rb:13:in `ensure_matrix'
	from /home/mikael/.rbenv/versions/2.4.1/lib/ruby/gems/2.4.0/gems/kmeans-clusterer-0.11.4/lib/kmeans-clusterer.rb:130:in `run'
	from /home/mikael/.rbenv/versions/2.4.1/lib/ruby/gems/2.4.0/gems/zombie_writer-0.2.0/lib/zombie_writer.rb:80:in `generate_clusters'
	from /home/mikael/.rbenv/versions/2.4.1/lib/ruby/gems/2.4.0/gems/zombie_writer-0.2.0/lib/zombie_writer.rb:56:in `generate_articles'
	from /home/mikael/Documents/source/Zombie/main.rb:9:in `<top (required)>'
	from /home/mikael/.rbenv/versions/2.4.1/lib/ruby/gems/2.4.0/gems/ruby-debug-ide-0.6.0/lib/ruby-debug-ide.rb:88:in `debug_load'
	from /home/mikael/.rbenv/versions/2.4.1/lib/ruby/gems/2.4.0/gems/ruby-debug-ide-0.6.0/lib/ruby-debug-ide.rb:88:in `debug_program'
	from /home/mikael/.rbenv/versions/2.4.1/lib/ruby/gems/2.4.0/gems/ruby-debug-ide-0.6.0/bin/rdebug-ide:130:in `<top (required)>'
	from /home/mikael/.rbenv/versions/2.4.1/bin/rdebug-ide:23:in `load'
	from /home/mikael/.rbenv/versions/2.4.1/bin/rdebug-ide:23:in `<main>'

I hope you can help me pinpoint what has gone wrong here. Keep up the good work.
