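When a higher-level region name (here the city) appears a second time later in the address, parsing and similarity both degrade. Removing the repeated name raises the similarity substantially. Could the author optimize for this case?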
String t1 = "海南省海口市灵山镇海榆大道4号绿地城.润园海口市灵山西片去旧改项目A-32地块11#楼(栋)2(单元)2(层)203(号)";
String t2 = "海南省海口市灵山镇海榆大道4号绿地城.润园11#楼2单元203";
Result:
海南省海口市灵山镇海榆大道4号绿地城.润园海口市灵山西片去旧改项目A-32地块11#楼(栋)2(单元)2(层)203(号)
addr1 >>>> Address(
provinceId=460000000000, province=海南省,
cityId=460100000000, city=海口市,
districtId=460108000000, district=美兰区,
streetId=460108101000, street=灵山镇,
townId=460108101000, town=灵山镇,
villageId=null, village=null,
road=null,
roadNum=null,
buildingNum=A-32,
text=西片去旧改项目地块11#楼22203栋单元层号
)
>>>>>>>>>>>>>>>>>
海南省海口市灵山镇海榆大道4号绿地城.润园11#楼2单元203
addr2 >>>> Address(
provinceId=460000000000, province=海南省,
cityId=460100000000, city=海口市,
districtId=460108000000, district=美兰区,
streetId=460108101000, street=灵山镇,
townId=460108101000, town=灵山镇,
villageId=null, village=null,
road=海榆大道,
roadNum=4号,
buildingNum=11#楼2单元203,
text=绿地城润园
)
加载扩展词典:dic/region.dic
加载扩展词典:dic/community.dic
加载扩展停止词典:dic/stop.dic
相似度结果分析 >>>>>>>>> MatchedResult(
doc1=Document(terms=[Term(灵山镇), Term(A), Term(32), Term(西片), Term(去), Term(旧), Term(改), Term(项目), Term(地块), Term(11#), Term(楼), Term(22203), Term(栋), Term(单元), Term(层), Term(号)], town=Term(灵山镇), village=null, road=null, roadNum=null, roadNumValue=0),
doc2=Document(terms=[Term(灵山镇), Term(海榆大道), Term(4号), Term(11), Term(2), Term(203), Term(绿地城), Term(润园)], town=Term(灵山镇), village=null, road=Term(海榆大道), roadNum=Term(4号), roadNumValue=4),
terms=[io.patamon.geocoding.similarity.MatchedTerm@2cfb4a64],
similarity=0.4886777774252209
)
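With the second 海口市 removed from t1, road and roadNum are parsed correctly and the similarity rises from about 0.49 to about 0.72: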
String t1 = "海南省海口市灵山镇海榆大道4号绿地城.润园灵山西片去旧改项目A-32地块11#楼(栋)2(单元)2(层)203(号)";
String t2 = "海南省海口市灵山镇海榆大道4号绿地城.润园11#楼2单元203";
Result:
海南省海口市灵山镇海榆大道4号绿地城.润园灵山西片去旧改项目A-32地块11#楼(栋)2(单元)2(层)203(号)
addr1 >>>> Address(
provinceId=460000000000, province=海南省,
cityId=460100000000, city=海口市,
districtId=460108000000, district=美兰区,
streetId=460108101000, street=灵山镇,
townId=460108101000, town=灵山镇,
villageId=null, village=null,
road=海榆大道,
roadNum=4号,
buildingNum=A-32,
text=绿地城润园灵山西片去旧改项目地块11#楼22203栋单元层号
)
>>>>>>>>>>>>>>>>>
海南省海口市灵山镇海榆大道4号绿地城.润园11#楼2单元203
addr2 >>>> Address(
provinceId=460000000000, province=海南省,
cityId=460100000000, city=海口市,
districtId=460108000000, district=美兰区,
streetId=460108101000, street=灵山镇,
townId=460108101000, town=灵山镇,
villageId=null, village=null,
road=海榆大道,
roadNum=4号,
buildingNum=11#楼2单元203,
text=绿地城润园
)
加载扩展词典:dic/region.dic
加载扩展词典:dic/community.dic
加载扩展停止词典:dic/stop.dic
相似度结果分析 >>>>>>>>> MatchedResult(
doc1=Document(terms=[Term(灵山镇), Term(海榆大道), Term(4号), Term(A), Term(32), Term(绿地城), Term(润园), Term(灵山), Term(西片), Term(去), Term(旧), Term(改), Term(项目), Term(地块), Term(11#), Term(楼), Term(22203), Term(栋), Term(单元), Term(层), Term(号)], town=Term(灵山镇), village=null, road=Term(海榆大道), roadNum=Term(4号), roadNumValue=4),
doc2=Document(terms=[Term(灵山镇), Term(海榆大道), Term(4号), Term(11), Term(2), Term(203), Term(绿地城), Term(润园)], town=Term(灵山镇), village=null, road=Term(海榆大道), roadNum=Term(4号), roadNumValue=4),
terms=[io.patamon.geocoding.similarity.MatchedTerm@4b6995df, io.patamon.geocoding.similarity.MatchedTerm@2fc14f68, io.patamon.geocoding.similarity.MatchedTerm@591f989e, io.patamon.geocoding.similarity.MatchedTerm@66048bfd, io.patamon.geocoding.similarity.MatchedTerm@61443d8f],
similarity=0.7152705001057788
)
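
Until something like this is handled inside the library, a possible workaround is to pre-clean the address text before passing it in. The following is only a minimal sketch of the manual step done above (dropping the repeated 海口市). `RepeatedRegionCleaner` and `removeRepeatedRegions` are hypothetical names, not part of the geocoding library, and the list of region names is assumed to be supplied by the caller.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, not part of the geocoding library.
public class RepeatedRegionCleaner {

    /**
     * Keeps the first occurrence of each region name and deletes
     * every later occurrence of the same name from the address.
     * Assumption: regionNames holds the higher-level names
     * (province, city, ...) that should appear only once.
     */
    public static String removeRepeatedRegions(String address, List<String> regionNames) {
        String result = address;
        for (String name : regionNames) {
            if (name == null || name.isEmpty()) {
                continue;
            }
            int first = result.indexOf(name);
            if (first < 0) {
                continue; // name not present, nothing to clean
            }
            int keepUntil = first + name.length();
            // Keep everything up to and including the first occurrence,
            // then strip the name from the remainder.
            String head = result.substring(0, keepUntil);
            String tail = result.substring(keepUntil).replace(name, "");
            result = head + tail;
        }
        return result;
    }

    public static void main(String[] args) {
        String t1 = "海南省海口市灵山镇海榆大道4号绿地城.润园海口市灵山西片去旧改项目A-32地块11#楼(栋)2(单元)2(层)203(号)";
        System.out.println(removeRepeatedRegions(t1, Arrays.asList("海南省", "海口市")));
    }
}
```

For the first t1 above this produces the same string as the second t1, i.e. the input that scored 0.715. Blindly deleting repeats could be wrong for addresses where the repeated name is part of a community or project name, so this is a stopgap rather than a proper fix.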