webloganalysisbyspark's People

Contributors

springty

webloganalysisbyspark's Issues

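Requirement 2 analysis: the `全球` ("Global") region shows up as more than one group (solved)

While analyzing the results by region, I ran into a small anomaly that I could not figure out for a long time...
Maybe I am just not familiar enough with partition by. Any pointers are welcome.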

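import org.apache.spark.sql.expressions.Window
import org.apache.spark.sql.functions.row_number
// `$"..."` also needs `import spark.implicits._` (assuming a SparkSession named `spark`)

// Rank videos within each city by view count and keep the top 3 per city.
val topNByRegion = regionInfo.select(
  regionInfo("day"),
  regionInfo("city"),
  regionInfo("cmsId"),
  regionInfo("times"),
  row_number().over(Window.partitionBy(regionInfo("city"))
    .orderBy(regionInfo("times").desc)).as("times_rank")
).filter($"times_rank" <= 3)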

20161110 内蒙古自治区 1154 601 3
20161110 全球 133 414 1
20161110 全球 420 376 2
20161110 全球 267 349 3
20161110 甘肃省 214 893 1
20161110 甘肃省 144 592 2

...

20161110 全球 144 2073 1
20161110 全球 746 1439 2
20161110 全球 643 1182 3
20161110 江苏省 981 2060 1
20161110 江苏省 132 1536 2

`全球` appears twice, as two separate groups.

I spotted the problem by running the following query in MySQL:

select * from videoRegion order by city,times_rank;

Result:
| 20161110 | 云南省 | 333 | 527 | 3 |
| 20161110 | 全球 | 133 | 414 | 1 |
| 20161110 | 全球 | 144 | 2073 | 1 |
| 20161110 | 全球 | 420 | 376 | 2 |
| 20161110 | 全球 | 746 | 1439 | 2 |
| 20161110 | 全球 | 267 | 349 | 3 |
| 20161110 | 全球 | 643 | 1182 | 3 |

I just took another look:

mysql> select count(*) from videoRegion;
+----------+
| count(*) |
+----------+
|      102 |
+----------+
1 row in set (0.01 sec)

So the table holds 102 rows in total.

But when I tried to remove the `全球` rows, I ran the following:

mysql> delete from videoRegion where videoRegion.city like '全球';

Query OK, 3 rows affected (0.02 sec)

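I thought my eyes were playing tricks on me, so I checked the data again:

mysql> select count(*) from videoRegion;
+----------+
| count(*) |
+----------+
|       99 |
+----------+
1 row in set (0.01 sec)

Only 3 of the 6 `全球` rows were deleted, so it looks like some of the `全球` values contain a space or something similar???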
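One way to test this kind of suspicion without relying on how a console renders the text is to print each distinct key together with its length and code points. The following is only a minimal sketch in plain Scala; the object name `WhitespaceCheck`, the helper `describe`, and the sample values (including the trailing space on the second one) are assumptions made for illustration, not taken from the project.

```scala
object WhitespaceCheck {
  // Hypothetical sample of city values; the second one carries a trailing space.
  val cities: Seq[String] = Seq("全球", "全球 ", "甘肃省")

  // Render every character as a Unicode code point so invisible ones show up.
  def describe(s: String): String =
    s.map(c => f"U+${c.toInt}%04X").mkString(" ")

  def main(args: Array[String]): Unit =
    cities.distinct.foreach { c =>
      // Brackets make leading or trailing whitespace visible in the output.
      println(s"[$c] length=${c.length} ${describe(c)}")
    }
}
```

Two keys that look identical but differ in length, or whose code points end in `U+0020`, would confirm that the grouping column is not clean.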

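That seems to be it. To confirm the idea, I re-ran the job to write the results to MySQL again and then executed the same statement as before: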

select * from videoRegion order by city,times_rank;

| 20161110 | 全球 | 133 | 414 | 1 |
| 20161110 | 全球 | 144 | 2073 | 1 |
| 20161110 | 全球 | 420 | 376 | 2 |
| 20161110 | 全球 | 746 | 1439 | 2 |
| 20161110 | 全球 | 267 | 349 | 3 |
| 20161110 | 全球 | 643 | 1182 | 3 |

This result is baffling. The primary sort key is city, so if the two `全球` values really were different, the output should be:

city|times_rank
全球|1
全球|2
全球|3
全球|1
全球|2
全球|3

But the actual result is:

全球|1
全球|1
全球|2
全球|2
全球|3
全球|3

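This is hard to understand. I am going to go back and clean the data, trying city.replace(" ","").
OK, solved. In a database GUI tool I could see that some of the values really do carry an extra space. It seems that MySQL ignores trailing spaces when comparing varchar values for ORDER BY (this is the behavior of PAD SPACE collations), so both variants sort as the same city and interleave by times_rank. LIKE, on the other hand, treats trailing spaces as significant, which is why the delete above matched only 3 rows. Spark compares the strings exactly, so partitionBy put the two variants into separate groups. The logic is clear now, and it feels good to have solved it myself.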
mysql> select count(*) from videoRegion;
+----------+
| count(*) |
+----------+
|       99 |
+----------+
1 row in set (0.01 sec)
The result is normal now. Sure enough, when you rely on someone else's package... you never know what bugs it may bring.
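The effect of normalizing the key before grouping can be sketched in plain Scala collections, without Spark. This is an illustration under stated assumptions, not the project's code: the `Row` case class, the `topN` helper, and the choice of which three rows carry the trailing space are hypothetical, while the cmsId and times values are the six `全球` rows shown above.

```scala
object TopNByRegion {
  // Hypothetical row type mirroring the columns used in the issue.
  final case class Row(day: String, city: String, cmsId: Int, times: Int)

  // The six 全球 rows from the output above. Which three carry the
  // trailing space is an assumption made for this sketch.
  val sample: Seq[Row] = Seq(
    Row("20161110", "全球", 133, 414),
    Row("20161110", "全球", 420, 376),
    Row("20161110", "全球", 267, 349),
    Row("20161110", "全球 ", 144, 2073),
    Row("20161110", "全球 ", 746, 1439),
    Row("20161110", "全球 ", 643, 1182)
  )

  // Trim the city, group by it, then rank by times (descending) and keep
  // the top n rows per city together with their 1-based rank.
  def topN(rows: Seq[Row], n: Int): Map[String, Seq[(Row, Int)]] =
    rows
      .map(r => r.copy(city = r.city.trim))
      .groupBy(_.city)
      .map { case (city, rs) =>
        val ranked = rs.sortBy(r => -r.times).take(n).zipWithIndex
          .map { case (r, i) => (r, i + 1) }
        city -> ranked
      }

  def main(args: Array[String]): Unit =
    for ((city, ranked) <- topN(sample, 3); (r, rank) <- ranked)
      println(s"${r.day} $city ${r.cmsId} ${r.times} $rank")
}
```

Note that `trim` only strips leading and trailing characters up to U+0020, so a full-width space (U+3000) would need separate handling. In the Spark job itself, a comparable cleanup could probably be done by applying `trim` from `org.apache.spark.sql.functions` to the city column before the window is built.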
