Comments (9)
@1140325971 Hello, this is normal. We set a word-frequency threshold, and only words whose frequency is above this threshold are considered. Thanks!
from dkn.
Hi! Entity linking was not done by us; it was already included in the original dataset. The "kg.txt" file is already disambiguated.
Thank you for your reply!
Another question: according to the paper, each word in a news title is associated with one entity in the KG. But in raw_train.txt and raw_test.txt, a news title contains only one, two, or a few entities, so the number of entities is not the same as the number of words in the title. Do you simply use padding to make them the same length, or is there some other handling that I have not noticed?
Thanks!
Yes, they are padded with zeros.
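As a rough illustration of the zero padding described in this reply, the sketch below gives every word position an entity ID and uses 0 wherever no entity applies. The function name `align_entities`, the `word_to_entity` mapping, and the fixed length of 10 are assumptions made for illustration; this is not the repository's actual code.

```python
MAX_TITLE_LENGTH = 10  # assumed fixed title length, for illustration only


def align_entities(words, word_to_entity, max_len=MAX_TITLE_LENGTH):
    """Return one entity ID per word position, using 0 where no entity applies."""
    entity_ids = [0] * max_len  # start fully padded with zeros
    for position, word in enumerate(words[:max_len]):
        # Words that are not part of any linked entity keep the padding value 0.
        entity_ids[position] = word_to_entity.get(word, 0)
    return entity_ids


# Hypothetical example: only "tom" and "cruise" belong to a linked entity (ID 3).
title = "watch tom cruise recreates iconic movie scenes".split()
print(align_entities(title, {"tom": 3, "cruise": 3}))
```

With this layout the entity sequence always has the same length as the word sequence, so the two can be stacked position by position.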
Thanks! I have been reading your papers recently, including RippleNet, the multi-task learning work for recommender systems, and others. I hope to have further discussion.
@hwwang55 Hi Dr. Wang, thanks for sharing the code! I am curious about the entity linking method you used. Can you share some ideas? It seems to affect the recommendation results very much. Thanks very much!
@bifeng Hi! I'm afraid I can't help much, since I don't work in the area of entity linking. I suggest searching for "entity linking survey" on Google Scholar; that might be helpful. Thanks!
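@hwwang55 Hello Prof. Wang Hongwei! I have recently been studying the DKN code in depth. In the news_process step I found a problem: for many lines, the words of the news title in raw_train.txt and the encoded news title in train.txt do not match in either count or position (the same is true of the entity encodings). Here are some examples: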
0 tautog bite coming strong 0 36136:Tautog
0 bruce springsteen song magically rejected harry potter 0 331:Bruce Springsteen
0 watch tom cruise recreates iconic movie scenes james corden action packed minutes 0 3410:Tom Cruise
0 chuck says cool hall fame tupac 0 2808:Chuck D
0 big weather changes bethel dramatic change temps windy rain 0 17431:Bethel
0 1,2,3,0,0,0,0,0,0,0 0,0,0,0,0,0,0,0,0,0 0
0 4,5,6,7,8,0,0,0,0,0 2,2,0,0,0,0,0,0,0,0 0
0 9,10,11,12,13,14,15,16,17,18 0,3,3,0,0,0,0,0,0,0 0
0 21,22,23,24,25,26,0,0,0,0 4,0,0,0,0,0,0,0,0,0 0
0 27,28,29,30,31,32,0,0,0,0 0,0,0,0,0,0,0,0,0,0 0
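I took the first five lines from each of these two files. In the first line, for example, the news title has 4 words, but there are only 3 non-zero codes after encoding. In the second line, the news title has 7 words, but there are only 5 non-zero codes after encoding. This happens in a considerable share of the data. Is this normal? Thank you for your help!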
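A minimal sketch of how a word-frequency threshold, as mentioned in the reply at the top of this thread, can leave fewer non-zero codes than there are words in the raw title. The threshold value, the inclusive comparison, the helper names `build_vocab` and `encode_title`, and the toy corpus are all assumptions for illustration, not the repository's code.

```python
from collections import Counter

WORD_FREQ_THRESHOLD = 2  # assumed value, for illustration only
MAX_TITLE_LENGTH = 10    # assumed fixed title length


def build_vocab(titles, threshold=WORD_FREQ_THRESHOLD):
    """Assign an index (starting at 1) to every word seen at least `threshold` times."""
    counts = Counter(word for title in titles for word in title.split())
    vocab = {}
    for title in titles:
        for word in title.split():
            if counts[word] >= threshold and word not in vocab:
                vocab[word] = len(vocab) + 1  # index 0 is reserved for padding
    return vocab


def encode_title(title, vocab, max_len=MAX_TITLE_LENGTH):
    """Keep only in-vocabulary words, truncate to max_len, then pad with zeros."""
    ids = [vocab[word] for word in title.split() if word in vocab][:max_len]
    return ids + [0] * (max_len - len(ids))


# Toy corpus: "tautog" and "again" occur once, so both fall below the threshold.
titles = [
    "tautog bite coming strong",
    "bite coming strong again",
    "coming strong bite",
]
vocab = build_vocab(titles)
print(encode_title(titles[0], vocab))
```

In this toy corpus the first title has 4 words but only 3 non-zero codes, because the rare word is skipped rather than encoded. Skipping words also shifts the remaining codes to the left, which would explain the positional mismatch as well.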
@hwwang55 Thank you, and best wishes with your work!