blackout's People

Contributors

mkremins

blackout's Issues

Extracting multiple sentences on a page

Some articles are really long. It would be interesting to see blackout create a multi-line poem from a 500+ word article (or better yet, a standard 2000+ word article).

Bookmarklet crashes hard with strict CSPs

(ChromeOS 72.0.3626.122)

Running the bookmarklet as-is on a site with strict CSP, such as https://mastodon.social/users/Teryl_Pacieco/statuses/101709566428499937 , generates the following error:

Refused to load the script 'https://mkremins.github.io/blackout/bundle.js?431181' because it
violates the following Content Security Policy directive: "script-src 'self' https://THE.WEBSITE".
Note that 'script-src-elem' was not explicitly set, so 'script-src' is used as a fallback.

Trying to fetch() the source and inject it directly instead (the script itself doesn't load any further scripts) fails too:

Refused to connect to 'https://mkremins.github.io/blackout/bundle.js' because it violates the
following Content Security Policy directive: "connect-src 'self' blob: wss://THE.WEBSITE
https://THE.WEBSITE a.THE.WEBSITE".

So this is a pretty airtight CSP, and if there were a way to get around it from JS, browsers would treat that as a bug and fix it as soon as they found out.

Good news: chrome-extension:// bypasses CSP.
Bad news: Firefox doesn't use stable extension IDs, so a bookmarklet can't point at a fixed extension URL. The extension itself needs to inject the script (e.g. with a click action) instead.
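
A rough sketch of the click-action approach, assuming a Manifest V3 extension that ships its own copy of bundle.js and declares the "scripting" and "activeTab" permissions. The function name registerPoemifyAction and the packaged file name are hypothetical; the API object is passed in so the same code can take `chrome` or `browser`. Scripts injected this way come from the extension package, so they should not be blocked by the page's script-src directive, though I haven't verified this on every browser.

```javascript
// Hypothetical helper: wires the toolbar button to inject the poemifier.
// `browserApi` is the extension API object (`chrome` or `browser`).
// `bundleFile` is an assumed path inside the extension package.
function registerPoemifyAction(browserApi, bundleFile = "bundle.js") {
  browserApi.action.onClicked.addListener((tab) => {
    // Inject the packaged script into the tab whose button was clicked.
    return browserApi.scripting.executeScript({
      target: { tabId: tab.id },
      files: [bundleFile],
    });
  });
}

// Only register when actually running inside an extension background script.
if (typeof chrome !== "undefined" && chrome.action && chrome.scripting) {
  registerPoemifyAction(chrome);
}
```

This trades the convenience of a bookmarklet for an installed extension, but it avoids depending on any extension URL being stable across installs.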

Use heuristics when choosing which of several matches to use

We currently poemify a block of text by running several matchers over it in parallel, then randomly choosing one successful match to use from all the matches that succeeded. I suspect it’s possible to produce better results by using something other than a purely random draw when selecting which of several matches to use.

Some hypothesized heuristics that might produce better results:

  • Prefer longer matches (i.e. matches containing more words) over shorter matches.
  • Prefer matches containing fewer pronouns.
  • Prefer matches containing fewer repeated words.
  • Parallelism: prefer matches that are similar to previously selected matches in grammatical structure and/or word choice. (This one in particular might do a lot to make entire pages of generated poetry feel more coherent, because it would encourage repetition of a few key words and/or sentence structures throughout the page.)

Maybe we could assign each match a score based on these (or similar) heuristics; sort the list of successful matches by score; and then perform a semi-random selection that’s biased towards the front of the sorted list using something like biased-rand-nth.
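
To make the idea concrete, here is a minimal sketch of score-then-biased-pick. It assumes each match is just an array of word strings, which may not be how matches are actually represented, and it only covers the first three heuristics (length, pronouns, repeats). The names scoreMatch, biasedPick and chooseMatch are made up for illustration, the pronoun list is deliberately tiny, and squaring a uniform random number is one simple way to bias towards the front of the list, not necessarily what biased-rand-nth does.

```javascript
// Small illustrative pronoun list; a real one would be longer.
const PRONOUNS = new Set([
  "i", "me", "you", "he", "him", "she", "her", "it", "we", "us", "they", "them",
]);

// Score a match (assumed: array of words). Longer is better;
// each pronoun and each repeated word costs one point.
function scoreMatch(words) {
  const lower = words.map((w) => w.toLowerCase());
  const pronouns = lower.filter((w) => PRONOUNS.has(w)).length;
  const repeats = lower.length - new Set(lower).size;
  return lower.length - pronouns - repeats;
}

// Pick from a best-first list. Squaring r in [0, 1) pushes the
// chosen index towards 0, so high-scoring matches win more often.
function biasedPick(sorted, rng = Math.random) {
  const r = rng();
  return sorted[Math.floor(r * r * sorted.length)];
}

// Sort matches by descending score, then do the biased draw.
function chooseMatch(matches, rng = Math.random) {
  if (matches.length === 0) return null;
  const sorted = [...matches].sort((a, b) => scoreMatch(b) - scoreMatch(a));
  return biasedPick(sorted, rng);
}
```

The rng parameter is only there so the draw can be made deterministic in tests. A parallelism heuristic would need chooseMatch to also receive the previously selected matches, which this sketch leaves out.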

Preserve inline-level formatting when rewriting text

When rewriting a block of text into a chunk of poetry, we currently disregard inline-level formatting (such as italics and bolded text) entirely. Human-generated blackout poetry, on the other hand, often makes use of formatting in the underlying text to interesting aesthetic effect.

Perhaps we could take steps to preserve inline-level text formatting in the generated poetry by keeping track of where inline-level HTML tags (such as <em> and <strong>) open and close in the source text, then reinserting these tags in the proper places when writing out the poemified HTML.
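
A possible shape for this, sketched on plain strings rather than the DOM: walk the source HTML, keep a stack of the inline tags that are currently open, and record that stack on every word. When writing out the poem, kept words get their recorded tags wrapped back around them. Everything here is an assumption for illustration: the tokenize and render names, the "blackout" class, the restriction to attribute-free, well-nested <em> and <strong> tags, and the set of kept word indices standing in for whatever the matcher returns.

```javascript
// Matches opening and closing <em>/<strong> tags (no attributes).
const INLINE_TAG = /<(\/?)(em|strong)>/g;

// Split HTML into words, recording which inline tags enclose each word.
function tokenize(html) {
  const tokens = [];
  const open = []; // stack of currently open inline tags, outermost first
  let last = 0;
  const addText = (text) => {
    for (const word of text.split(/\s+/).filter(Boolean)) {
      tokens.push({ word, tags: [...open] });
    }
  };
  for (const m of html.matchAll(INLINE_TAG)) {
    addText(html.slice(last, m.index));
    if (m[1] === "/") open.pop(); // assumes well-nested tags
    else open.push(m[2]);
    last = m.index + m[0].length;
  }
  addText(html.slice(last));
  return tokens;
}

// Write out the poem: kept words regain their tags, the rest are blacked out.
// `keep` is a Set of token indices chosen by the matcher.
function render(tokens, keep) {
  return tokens
    .map((t, i) => {
      if (!keep.has(i)) return `<span class="blackout">${t.word}</span>`;
      // Wrap from the innermost tag outwards.
      return t.tags.reduceRight((inner, tag) => `<${tag}>${inner}</${tag}>`, t.word);
    })
    .join(" ");
}
```

Wrapping each kept word individually is simpler than reopening one tag across a run of words, at the cost of slightly noisier markup. A word with a tag boundary in the middle of it would be split into two tokens, which a DOM-based version would want to handle properly.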
