Comments (4)

fredzannarbor commented on June 16, 2024

Thank you. I did not understand how to reconstitute the dom from the constituents.

from sumy.

fredzannarbor commented on June 16, 2024

OK, one more obstacle.

    parser = PlaintextParser.from_string(text, Tokenizer("english"))
    summarizer = LexRankSummarizer()
    summarizer.stop_words = get_stop_words("english")
    print(len(parser.document.sentences))
    paragraphs = []
    drops = []
    #print(len(parser.document.paragraphs))
    for paragraph in parser.document.paragraphs:
        #print(len(paragraph.sentences))
        for sentence in paragraph.sentences:

            if "......" in str(sentence):
                drops.append(sentence)
            else:
                paragraphs.append(sentence)

    print(len(drops), len(paragraphs))
    dom = ObjectDocumentModel(paragraphs)
    print(len(dom.paragraphs))

    summary = summarizer(dom, sentences_count)

The extra code is just there to confirm that the filter drops the problem sentences and that the kept and dropped counts add up correctly. But when I try to summarize the filtered DOM, it throws an error.

1746
9 1737
1737
Traceback (most recent call last):
  File "app/utilities/text2sumy_summarize.py", line 53, in <module>
    result = sumy_summarize(text, sentences_count=args.sentences_count)
  File "app/utilities/text2sumy_summarize.py", line 32, in sumy_summarize
    summary = summarizer(dom, sentences_count)
  File "/Users/fred/.virtualenvs/pycharmed-unity/lib/python3.8/site-packages/sumy/summarizers/lex_rank.py", line 36, in __call__
    sentences_words = [self._to_words_set(s) for s in document.sentences]
  File "/Users/fred/.virtualenvs/pycharmed-unity/lib/python3.8/site-packages/sumy/utils.py", line 53, in decorator
    setattr(self, key, getter(self))
  File "/Users/fred/.virtualenvs/pycharmed-unity/lib/python3.8/site-packages/sumy/models/dom/_document.py", line 23, in sentences
    return tuple(chain(*sentences))
  File "/Users/fred/.virtualenvs/pycharmed-unity/lib/python3.8/site-packages/sumy/models/dom/_document.py", line 22, in <genexpr>
    sentences = (p.sentences for p in self._paragraphs)
AttributeError: 'Sentence' object has no attribute 'sentences'

miso-belica commented on June 16, 2024

Hello, DOM is just an object consisting of paragraphs and sentences. You can filter sentences out and create a new one if you want.

paragraphs = []
for p in parser.document.paragraphs:
    # Python strings have no .contains() method; use the `in` operator instead
    paragraphs.append([s for s in p.sentences if "...." not in str(s)])

dom = ObjectDocumentModel(paragraphs)

You may have to cover the edge case where you remove all sentences from a paragraph. But maybe even empty paragraphs will work.
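The per-paragraph filtering described above, including the empty-paragraph edge case, could be sketched roughly as follows. This is a minimal illustration, not sumy's own code: `Sentence`, `Paragraph`, and `filter_paragraphs` are hypothetical stand-ins defined here so the example runs without sumy or its tokenizer data. Judging by the traceback earlier in the thread, each element handed to `ObjectDocumentModel` needs a `sentences` attribute, so the sketch wraps the kept sentences back into a paragraph object rather than leaving them as a bare list.

```python
from dataclasses import dataclass
from typing import List, Tuple


# Hypothetical stand-ins for the DOM classes, so the sketch runs without sumy.
@dataclass(frozen=True)
class Sentence:
    text: str

    def __str__(self) -> str:
        return self.text


@dataclass(frozen=True)
class Paragraph:
    sentences: Tuple[Sentence, ...]


def filter_paragraphs(paragraphs: List[Paragraph], marker: str = "....") -> List[Paragraph]:
    """Drop sentences containing `marker` and skip paragraphs left empty."""
    kept = []
    for p in paragraphs:
        sentences = tuple(s for s in p.sentences if marker not in str(s))
        if sentences:  # edge case: every sentence in this paragraph was dropped
            kept.append(Paragraph(sentences))
    return kept


paragraphs = [
    Paragraph((Sentence("Chapter 1 ........ 3"),)),
    Paragraph((Sentence("A real sentence."), Sentence("Index ...... 9"))),
]
filtered = filter_paragraphs(paragraphs)
print([[str(s) for s in p.sentences] for p in filtered])
```

Skipping empty paragraphs is a design choice made here for safety; whether the real library tolerates empty paragraphs is left open in the comment above.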

miso-belica commented on June 16, 2024

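The bug is on this line: paragraphs.append(sentence) 😉 It appends Sentence objects to the list that ObjectDocumentModel treats as paragraphs, which is why the traceback ends with 'Sentence' object has no attribute 'sentences'.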
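To see why a flat list of sentences fails, here is a small reproduction under stated assumptions. The `Document` class below is a hypothetical stand-in that only mirrors the shape of the `sentences` getter visible in the traceback (`p.sentences for p in self._paragraphs`); it is not sumy's actual implementation, and `Sentence` and `Paragraph` are likewise simplified placeholders.

```python
from itertools import chain


class Sentence:
    def __init__(self, text):
        self.text = text


class Paragraph:
    def __init__(self, sentences):
        self.sentences = tuple(sentences)


class Document:
    """Stand-in that flattens paragraphs the way the traceback suggests."""

    def __init__(self, paragraphs):
        self._paragraphs = tuple(paragraphs)

    @property
    def sentences(self):
        # Every element of _paragraphs must expose a `sentences` attribute.
        return tuple(chain(*(p.sentences for p in self._paragraphs)))


flat = [Sentence("one"), Sentence("two")]

# Passing sentences where paragraphs are expected reproduces the error.
try:
    Document(flat).sentences
    error = None
except AttributeError as exc:
    error = str(exc)
print(error)

# Wrapping the same sentences in a paragraph satisfies the getter.
nested = Document([Paragraph(flat)])
print(len(nested.sentences))
```

The failure only appears when `sentences` is first read, which matches the original report: constructing the document succeeded and the error surfaced later, inside the summarizer.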
