To perform automatic text summarization using Natural Language Processing (NLP) techniques.
Algorithm:
Step 1: Import necessary libraries for natural language processing tasks.<br>
Step 2: Download NLTK resources, including the punkt tokenizer and stopwords.<br>
Step 3: Define the Text Preprocessing Function to tokenize, remove stopwords, and perform stemming.<br>
Step 4: Define the Text Summarization Function using a simple frequency-based approach.<br>
- Calculate the frequency of each word in the preprocessed text.<br>
- Calculate a score for each sentence based on the sum of word frequencies.<br>
- Select the top N sentences with the highest scores to form the summary.<br>
Step 5: Construct the main program to read the paragraph and perform text summarization.<br>
- Print the original text.<br>
- Generate and print the text summary using the Text Summarization function.<br>
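
The scoring described in Step 4 can be sketched on a toy input without NLTK. This is only a minimal illustration: the helper names `tokenize` and `summarize`, the regex-based tokenizer and sentence splitter, and the tiny stopword set are assumptions made for this sketch, and stemming is left out. Each distinct word in a sentence contributes its overall frequency to that sentence's score once.

```python
import re
from collections import Counter

# Hypothetical mini stopword list, used only for this sketch.
STOPWORDS = {"is", "a", "of", "the", "and", "in", "it", "that", "this", "to"}

def tokenize(text):
    """Lowercase the text and keep alphanumeric words that are not stopwords."""
    return [w for w in re.findall(r"[a-z0-9]+", text.lower()) if w not in STOPWORDS]

def summarize(text, num_sentences=1):
    # Naive sentence split after ., ! or ? (an assumption for this sketch).
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]
    # Frequency of each word over the whole text.
    freq = Counter(tokenize(text))
    # Sentence score = sum of the frequencies of the distinct words it contains.
    scores = {s: sum(freq[w] for w in set(tokenize(s))) for s in sentences}
    # Keep the N highest-scoring sentences.
    top = sorted(sentences, key=lambda s: scores[s], reverse=True)[:num_sentences]
    return top, scores

text = "Cats chase mice. Dogs chase cats and cats run. Birds sing."
top, scores = summarize(text)
print(scores)
print(top)
```

Here "cats" occurs three times and "chase" twice, so the second sentence scores 7 (1 + 2 + 3 + 1), the first scores 6, and the third scores 2; the second sentence is selected.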
Program:
DEVELOPED BY: SriVarshan P
REGISTER NUMBER: 212222240104
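# Notebook shell command; in a terminal, run "pip install nltk" instead
!pip install nltk

import nltk
from nltk.corpus import stopwords
from nltk.tokenize import word_tokenize, sent_tokenize
from nltk.stem import PorterStemmer

nltk.download('punkt')
# Newer NLTK releases load the tokenizer data from 'punkt_tab'
nltk.download('punkt_tab')
nltk.download('stopwords')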

def preprocess_text(text):
    # Tokenize the text into words
    words = word_tokenize(text)
    # Remove stopwords and punctuation
    stop_words = set(stopwords.words('english'))
    filtered_words = [word for word in words if word.lower() not in stop_words and word.isalnum()]
    # Stemming
    stemmer = PorterStemmer()
    stemmed_words = [stemmer.stem(word) for word in filtered_words]
    return stemmed_words

def generate_summary(text, num_sentences=3):
    sentences = sent_tokenize(text)
    preprocessed_text = preprocess_text(text)
    # Calculate the frequency of each word
    word_frequencies = nltk.FreqDist(preprocessed_text)
    # Calculate the score for each sentence based on word frequency
    sentence_scores = {}
    for sentence in sentences:
        for word, freq in word_frequencies.items():
            if word in sentence.lower():
                if sentence not in sentence_scores:
                    sentence_scores[sentence] = freq
                else:
                    sentence_scores[sentence] += freq
    # Select top N sentences with highest scores
    summary_sentences = sorted(sentence_scores, key=sentence_scores.get, reverse=True)[:num_sentences]
    return ' '.join(summary_sentences)

if __name__ == "__main__":
    input_text = """ Natural language processing (NLP) is a subfield of artificial intelligence. It involves the development of algorithms and models that enact NLP. NLP is used in various applications, including chatbots, language Understanding, and language generation. This program demonstrates a simple text summarization using NLP"""
    summary = generate_summary(input_text)
    print("Original Text: ")
    print(input_text)
    print("\nSummary: ")
    print(summary)
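
One caveat about the scoring loop in `generate_summary`: the test `word in sentence.lower()` is a substring check, so a short stem can also match inside an unrelated longer word. The plain-Python sketch below shows the difference between a substring check and a whole-token check; the example sentence and the stem `cat` are hypothetical, chosen only to illustrate the point.

```python
import re

sentence = "We concatenate the strings."
word = "cat"  # hypothetical stem, chosen only to illustrate the pitfall

# Substring check, as in generate_summary: matches inside "concatenate".
substring_hit = word in sentence.lower()

# Whole-token check: split the sentence into words before comparing.
tokens = re.findall(r"[a-z0-9]+", sentence.lower())
token_hit = word in tokens

print(substring_hit)  # True
print(token_hit)      # False
```

If whole-token matching were used in the full program, the sentence tokens would also need to be stemmed before the comparison, since the frequency table holds stems rather than surface words.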
Output
Result:
Thus, the program to perform text summarization was executed successfully.