Comments (4)
Thank you. I did not understand how to reconstitute the dom from the constituents.
from sumy.
OK, one more obstacle.
parser = PlaintextParser.from_string(text, Tokenizer("english"))
summarizer = LexRankSummarizer()
summarizer.stop_words = get_stop_words("english")
print(len(parser.document.sentences))
paragraphs = []
drops = []
#print(len(parser.document.paragraphs))
for paragraph in parser.document.paragraphs:
    #print(len(paragraph.sentences))
    for sentence in paragraph.sentences:
        if "......" in str(sentence):
            drops.append(sentence)
        else:
            paragraphs.append(sentence)
print(len(drops), len(paragraphs))
dom = ObjectDocumentModel(paragraphs)
print(len(dom.paragraphs))
summary = summarizer(dom, sentences_count)
The extra code is just there to confirm that the filter drops the problem sentences and that the kept and dropped counts add up correctly. But when I try to summarize the filtered DOM, it throws an error.
1746
9 1737
1737
Traceback (most recent call last):
  File "app/utilities/text2sumy_summarize.py", line 53, in <module>
    result = sumy_summarize(text, sentences_count=args.sentences_count)
  File "app/utilities/text2sumy_summarize.py", line 32, in sumy_summarize
    summary = summarizer(dom, sentences_count)
  File "/Users/fred/.virtualenvs/pycharmed-unity/lib/python3.8/site-packages/sumy/summarizers/lex_rank.py", line 36, in __call__
    sentences_words = [self._to_words_set(s) for s in document.sentences]
  File "/Users/fred/.virtualenvs/pycharmed-unity/lib/python3.8/site-packages/sumy/utils.py", line 53, in decorator
    setattr(self, key, getter(self))
  File "/Users/fred/.virtualenvs/pycharmed-unity/lib/python3.8/site-packages/sumy/models/dom/_document.py", line 23, in sentences
    return tuple(chain(*sentences))
  File "/Users/fred/.virtualenvs/pycharmed-unity/lib/python3.8/site-packages/sumy/models/dom/_document.py", line 22, in <genexpr>
    sentences = (p.sentences for p in self._paragraphs)
AttributeError: 'Sentence' object has no attribute 'sentences'
Hello, the DOM is just an object consisting of paragraphs and sentences. You can filter sentences out and create a new one if you want.
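from sumy.models.dom import ObjectDocumentModel, Paragraph

paragraphs = []
for p in parser.document.paragraphs:
    # keep sentences without the dotted filler; wrap them so the DOM can read .sentences
    kept = [s for s in p.sentences if "...." not in str(s)]
    paragraphs.append(Paragraph(kept))
dom = ObjectDocumentModel(paragraphs)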
You may have to cover the edge case where you remove all sentences from a paragraph. But maybe even empty paragraphs will work.
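A minimal sketch of the per-paragraph filter with the empty-paragraph edge case handled by skipping such paragraphs. To stay runnable without sumy installed, `Paragraph` and `Document` here are hypothetical stand-ins that only mimic the `.sentences` flattening visible in the traceback, and `filter_document` and its `marker` parameter are illustrative names, not part of sumy. With the real library, the same loop would presumably build sumy's own paragraph objects and pass them to `ObjectDocumentModel`.

```python
from itertools import chain


class Paragraph:
    """Hypothetical stand-in: a paragraph that owns a tuple of sentences."""

    def __init__(self, sentences):
        self.sentences = tuple(sentences)


class Document:
    """Hypothetical stand-in mirroring the flattening seen in the traceback."""

    def __init__(self, paragraphs):
        self.paragraphs = tuple(paragraphs)

    @property
    def sentences(self):
        # Same idea as the quoted _document.py lines: chain every paragraph's sentences.
        return tuple(chain(*(p.sentences for p in self.paragraphs)))


def filter_document(document, marker="......"):
    """Return a new Document without the sentences that contain `marker`."""
    kept_paragraphs = []
    for paragraph in document.paragraphs:
        kept = [s for s in paragraph.sentences if marker not in str(s)]
        if kept:  # skip paragraphs that lost all of their sentences
            kept_paragraphs.append(Paragraph(kept))
    return Document(kept_paragraphs)


original = Document([
    Paragraph(["Chapter 1 ...... 3", "Chapter 2 ...... 9"]),  # only filler lines
    Paragraph(["A real sentence.", "Intro ...... 1", "Another one."]),
])
filtered = filter_document(original)
print(len(original.sentences), len(filtered.sentences), len(filtered.paragraphs))
```

Skipping emptied paragraphs sidesteps the question of whether empty paragraphs are accepted; if paragraph boundaries matter downstream, keeping them as empty paragraphs is the alternative to try.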
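The bug is on this line: paragraphs.append(sentence). It puts bare sentences into the list, but the document model expects paragraphs, i.e. objects that have a .sentences attribute.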
😉
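To make the failure concrete, here is a small sketch that reproduces the same AttributeError outside sumy. `Sentence`, `Paragraph`, and `all_sentences` are hypothetical stand-ins; only the flattening expression is taken from the two `_document.py` lines quoted in the traceback.

```python
from itertools import chain


class Sentence:
    """Hypothetical stand-in for a sentence object."""

    def __init__(self, text):
        self.text = text


class Paragraph:
    """Hypothetical stand-in: groups sentences under a .sentences attribute."""

    def __init__(self, sentences):
        self.sentences = tuple(sentences)


def all_sentences(paragraphs):
    # Mirrors the flattening quoted in the traceback.
    return tuple(chain(*(p.sentences for p in paragraphs)))


s1, s2 = Sentence("First."), Sentence("Second.")

# Flat list of sentences, as built by paragraphs.append(sentence): fails.
try:
    all_sentences([s1, s2])
    error = None
except AttributeError as exc:
    error = str(exc)
print(error)

# One level of grouping restored: works.
flattened = all_sentences([Paragraph([s1, s2])])
print(len(flattened))
```

The flat list fails because each element is asked for `.sentences`; once the sentences are grouped under a paragraph-like object, the same flattening succeeds.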