A MapReduce job that counts how often each value appears in different log columns.
One line of log =>
- Key: author or bvid or user_location
- Value: (type, 1)
type map
key | type |
---|---|
author | 1 |
bvid | 2 |
user_location | 3 |
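
The type map above could be kept in one lookup so the mapper and reducer share the same codes. This is a minimal sketch; the class name `ColumnTypes` and the method `typeOf` are hypothetical and not taken from the project, and returning -1 for an unknown column is an assumption.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper: names are illustrative, not from the project.
final class ColumnTypes {
    // Type codes copied from the type map table.
    static final Map<String, Integer> TYPE_BY_KEY = new LinkedHashMap<>();

    static {
        TYPE_BY_KEY.put("author", 1);
        TYPE_BY_KEY.put("bvid", 2);
        TYPE_BY_KEY.put("user_location", 3);
    }

    // Returns the type code for a column name, or -1 (assumed convention) if it is not counted.
    static int typeOf(String column) {
        return TYPE_BY_KEY.getOrDefault(column, -1);
    }
}
```
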
log sample
15:45:52.111 [http-nio-8080-exec-2] INFO niit.start.util.LogGenerator - author:bvid from user_location
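
The map step for one log line can be sketched in plain Java, without the Hadoop API, to show how a line becomes three (key, (type, 1)) pairs. The class `LogLineMapper`, its nested `Emit` type, and the parsing rules are assumptions: the sketch assumes the message starts after the first " - ", that the author contains no ":", and that " from " separates bvid from user_location. The project's real mapper may parse the line differently.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: class, field, and method names are illustrative, not from the project.
final class LogLineMapper {
    // One emitted pair: key is the column value, (type, count) is the mapper value.
    static final class Emit {
        final String key;
        final int type;
        final int count;

        Emit(String key, int type) {
            this.key = key;
            this.type = type;
            this.count = 1; // every occurrence counts once
        }
    }

    private static final String PREFIX_END = " - "; // assumed end of the logger prefix
    private static final String FROM = " from ";    // assumed separator before user_location

    // Returns three pairs for a well-formed line, or an empty list otherwise.
    static List<Emit> map(String line) {
        List<Emit> out = new ArrayList<>();
        int prefixEnd = line.indexOf(PREFIX_END);
        if (prefixEnd < 0) {
            return out;
        }
        String message = line.substring(prefixEnd + PREFIX_END.length());
        int colon = message.indexOf(':');
        int from = message.indexOf(FROM, colon + 1);
        if (colon < 0 || from < 0) {
            return out;
        }
        out.add(new Emit(message.substring(0, colon).trim(), 1));             // author
        out.add(new Emit(message.substring(colon + 1, from).trim(), 2));      // bvid
        out.add(new Emit(message.substring(from + FROM.length()).trim(), 3)); // user_location
        return out;
    }
}
```

Lines that do not match the assumed layout produce no pairs, so malformed log entries are skipped rather than counted.
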
Mapper (K,V) =>
- Key: K - key of mapper (author or bvid or user_location)
- Value: type + "\t" + sum + "\t" + date
- type: the type of the Mapper V
- sum: the count of Mapper V entries for the key, using type to distinguish the columns
- date: the processing date
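
The reduce step described above can be sketched as follows, again in plain Java without the Hadoop API. The class `CountReducer`, the representation of each mapper value as an `int[]{type, count}`, and passing the date in as a preformatted string are all assumptions; the source does not state the date format. Counts are grouped by type so that the same key string appearing in two different columns is not merged.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical sketch: names and the int[]{type, count} encoding are illustrative.
final class CountReducer {
    // Sums the counts for one key, per type, and formats each result as
    // type + "\t" + sum + "\t" + date.
    static List<String> reduce(List<int[]> values, String date) {
        Map<Integer, Integer> sumByType = new TreeMap<>(); // sorted by type code
        for (int[] value : values) {
            sumByType.merge(value[0], value[1], Integer::sum);
        }
        List<String> out = new ArrayList<>();
        for (Map.Entry<Integer, Integer> entry : sumByType.entrySet()) {
            out.add(entry.getKey() + "\t" + entry.getValue() + "\t" + date);
        }
        return out;
    }
}
```
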
This job used to compute a topN, but I removed it because I think the full data is needed for analysis.
related project
- the results are stored in MySQL
- use ssh2 to execute MapReduce on Linux
- visualization and front end