mswon / sentimental-analysis

Sentimental Analysis with Naver movie ratings

Python 100.00%

sentimental-analysis's Issues

Error when running epochs=model.iter inside a for loop

Hello 원민섭,
I am following along with your Naver movie sentiment analysis video.

for epoch in range(30):
    model.train(tokens,model.corpus_count,epochs = model.iter)

When I run the for loop, I get the following ValueError related to epochs:
ValueError: You must specify either total_examples or total_words,
for proper job parameters updationand progress calculations.
The usual value is total_examples=model.corpus_count.

Could you advise me on how to fix this?
Thank you.

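A question about TensorBoard

I followed the code using a different text file. I don't know the cause, but when I open TensorBoard the graph is drawn, yet each point is tagged with a number instead of the Korean word. Do you happen to know how to fix this?

For Word2Vec, the most_similar results print fine, so training seems to have gone well.
The TensorBoard part is what is giving me trouble.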
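
One thing that may be worth checking, offered as an assumption rather than a confirmed diagnosis: the script in this issue tokenizes with `line.split(' ')`, which keeps the trailing newline on the last token of every line. If such a token ends up in the vocabulary, writing `word + "\n"` to the metadata file produces an extra blank row, so the metadata rows may no longer line up one-to-one with the embedding rows. The sketch below uses only the standard library and a made-up sample line to show the effect; `write_metadata` is a hypothetical helper, not part of gensim or TensorFlow.

```python
import io

def write_metadata(vocab, handle):
    """Write one label per line, mirroring the loop in the posted script."""
    for word in vocab:
        handle.write(word + "\n")

sample = "기후변화 대응 정책\n"  # made-up line standing in for the tweet corpus

raw_tokens = sample.split(' ')   # keeps '\n' attached to the last token
clean_tokens = sample.split()    # splits on any whitespace and drops '\n'

raw_file, clean_file = io.StringIO(), io.StringIO()
write_metadata(raw_tokens, raw_file)
write_metadata(clean_tokens, clean_file)

# Count the rows a line-based reader would see in each metadata file.
raw_rows = raw_file.getvalue().split("\n")[:-1]
clean_rows = clean_file.getvalue().split("\n")[:-1]

print(len(raw_tokens), len(raw_rows))      # token count vs. metadata rows
print(len(clean_tokens), len(clean_rows))
```

With the raw split, three tokens yield four metadata rows (one of them blank); with `split()` the counts match.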

import gensim
import codecs
import os
import numpy as np

doc = open(r'C:\Users\Lab01\Desktop\JYR\jupyter\new\tweet_all_word.txt', 'r')
doc1 = doc.readlines()
total_word = []
for line in doc1:
    words = line.split(' ')
    final_word = []
    for word in words:
        final_word.append(word)
    total_word.append(final_word)

tokens = total_word
model = gensim.models.Word2Vec(size=300,sg = 1, alpha=0.025,min_alpha=0.025, seed=1234)
model.build_vocab(tokens)

for epoch in range(30):
    model.train(tokens,model.corpus_count,epochs = model.iter)
    model.alpha -= 0.002
    model.min_alpha = model.alpha

model.save('Word2vec_tweet.model')

print (model.most_similar(positive=["기후변화"], topn=30))

import tensorflow as tf

max_size = len(model.wv.vocab)-1
w2v = np.zeros((max_size,model.layer1_size))

with codecs.open(r"C:\Users\Lab01\Desktop\JYR\jupyter\new\metadata2.tsv",'w+',encoding='utf-8') as file_metadata:
    for i,word in enumerate(model.wv.index2word[:max_size]):
        w2v[i] = model.wv[word]
        file_metadata.write(word + "\n")

from tensorflow.contrib.tensorboard.plugins import projector

sess = tf.InteractiveSession()

with tf.device("/cpu:0"):
    embedding = tf.Variable(w2v, trainable = False, name = 'embedding')

tf.global_variables_initializer().run()

path = 'word2vec'

saver = tf.train.Saver()
writer = tf.summary.FileWriter(path, sess.graph)

config = projector.ProjectorConfig()
embed = config.embeddings.add()
embed.tensor_name = 'embedding'
embed.metadata_path = r'C:\Users\Lab01\Desktop\JYR\jupyter\new\metadata2.tsv'

projector.visualize_embeddings(writer, config)
saver.save(sess, path + '/model.ckpt' , global_step=max_size)
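
The training loop in the script above lowers `model.alpha` by 0.002 on each of 30 passes, starting from 0.025. A quick arithmetic check in plain Python, independent of gensim, suggests the value drops below zero partway through. The helper name `alpha_schedule` is hypothetical; the snippet only reproduces the numbers that appear in the script.

```python
def alpha_schedule(start, step, epochs):
    """Return the alpha value in effect at the start of each pass."""
    values = []
    alpha = start
    for _ in range(epochs):
        values.append(alpha)
        alpha -= step
    return values

alphas = alpha_schedule(start=0.025, step=0.002, epochs=30)

# Zero-based index of the first pass whose alpha is already negative.
first_negative = next(i for i, a in enumerate(alphas) if a < 0)

print(first_negative)
print(round(alphas[-1], 3))  # alpha in effect on the final pass
```

Under these numbers the rate is negative from the pass with index 13 onward, so a smaller step or fewer passes would be needed to keep it positive for the whole run.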

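When training a classifier, the labels are split into 0 and 1

Can there be several such criteria?
A 1/0 value for criterion 1
A 1/0 value for criterion 2
A 1/0 value for criterion 3
....

Right now the model is trained on 0 and 1 as the training labels, meaning negative and positive.

Something 1: 1 if close, 0 if different

Something 2: 1 if close, 0 if different

...

In the way shown above.
Can several criteria like these be used?

Is this what is called a dimension?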
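
Having several yes/no criteria per example is commonly called multi-label classification: each target becomes a vector of 0/1 values, one entry per criterion, and the length of that vector is its dimension. The sketch below is plain Python with made-up criterion names and scores (none of them come from the repository). It applies an independent sigmoid and a 0.5 threshold to each criterion, which is one common way to set this up, not necessarily how the repository's model would be extended.

```python
import math

CRITERIA = ["positive", "about_acting", "about_story"]  # hypothetical criteria

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def predict_labels(scores, threshold=0.5):
    """Turn one raw score per criterion into independent 0/1 decisions."""
    return [1 if sigmoid(s) >= threshold else 0 for s in scores]

def binary_cross_entropy(targets, scores):
    """Average the per-criterion losses; each criterion is its own 0/1 task."""
    total = 0.0
    for t, s in zip(targets, scores):
        p = sigmoid(s)
        total += -(t * math.log(p) + (1 - t) * math.log(1 - p))
    return total / len(targets)

target = [1, 0, 1]         # one 0/1 label per criterion
scores = [2.0, -1.5, 0.3]  # made-up raw model outputs

print(dict(zip(CRITERIA, predict_labels(scores))))
print(binary_cross_entropy(target, scores))
```

The single negative/positive label used now is the special case where the target vector has length one.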

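Thanks for the code!

I would like to crawl posts containing specific keywords from an online community and run sentiment analysis on them.

If I use the train_set from your code, would sentiment analysis also work on everyday sentences rather than movie reviews?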
