Comments (35)
offtopic: Solo Leveling
is fantastic, I am putting it up there, almost up there with Jujutsu Kaisen
from vince.
Hi, sorry for the silence.
I have been busy looking for work and, at the same time, working on better index storage.
So, please be a bit patient; I plan to use the new storage https://github.com/gernest/rbf in vince.
Basically:
- I will remove the distributed stuff from vince: it makes the code complex and hard to debug.
- I will migrate to roaring bitmap indexes (they save a lot of space and compute).
Since we store raw events in vince, don't worry about losing data; I will provide a simple command to migrate existing data to the new store.
As a user, everything will still work without interruption. Better still, the timestamp-related issues will go away since the storage uses quantum indexes.
I'm doing this to help with maintenance. I am using the new storage with https://github.com/gernest/requiemdb; when I'm happy with it, I will move it to vince.
Rest assured you will be the first to be notified when it is ready.
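To make the roaring-bitmap idea concrete, here is a minimal sketch of a bitmap index: each attribute value maps to the set of event row IDs that have it, and queries are set intersections. Plain Python sets stand in for compressed roaring bitmaps, and the event/attribute layout is invented for illustration; this is not rbf's actual format.

```python
from collections import defaultdict

class BitmapIndex:
    """Toy bitmap index: value -> set of row IDs (a roaring bitmap
    would store these sets in compressed containers)."""

    def __init__(self):
        self.index = defaultdict(set)

    def add(self, row_id, value):
        self.index[value].add(row_id)

    def query(self, *values):
        # Answer "rows matching ALL values" by intersecting bitmaps.
        sets = [self.index[v] for v in values]
        return set.intersection(*sets) if sets else set()

idx = BitmapIndex()
idx.add(1, "country:FR"); idx.add(1, "page:/home")
idx.add(2, "country:FR"); idx.add(2, "page:/about")
idx.add(3, "country:US"); idx.add(3, "page:/home")
print(idx.query("country:FR", "page:/home"))  # {1}
```

The space savings come from the compressed representation of these ID sets; the query model (intersect per-value bitmaps) stays the same.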
from vince.
Would it be possible to have the option to select the one we want? But the 3 groups can be a good option for computing some stats
from vince.
Thanks for reporting, this is definitely a bug, I will investigate and fix tomorrow.
from vince.
@Ziedelth sorry I couldn't look into this today, I'm not feeling well.
from vince.
No problem, take care.
from vince.
Hi, so I looked into this. You are right, events are stored with timestamps that have the local timezone, but we process computed time as UTC.
We have two options:
- use `UTC` everywhere. This is the safer and correct way. It is a breaking change though, and all the data you collected so far will be lost.
- convert each time to UTC before doing any computation. This will be expensive, slow, and potentially error prone.
What do you think?
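The mismatch is easy to see in a small sketch: the same event lands in different day buckets depending on whether you truncate in local time or in UTC. The UTC+1 offset here is an assumed fixed offset standing in for France's timezone, ignoring DST.

```python
from datetime import datetime, timezone, timedelta

paris = timezone(timedelta(hours=1))                 # assumption: fixed UTC+1
event = datetime(2024, 3, 20, 0, 30, tzinfo=paris)   # stored with local tz

local_day = event.date()                             # bucket by local wall clock
utc_day = event.astimezone(timezone.utc).date()      # bucket by UTC

print(local_day, utc_day)  # 2024-03-20 2024-03-19 -- different day buckets
```

Any event in the first hour of a local day falls into the previous UTC day, which is exactly the kind of off-by-one that produces duplicated dates in a timeseries.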
from vince.
xD
from vince.
> Hi, so I looked into this. You are right, events are stored with timestamps that have the local timezone, but we process computed time as UTC. We have two options: use `UTC` everywhere (this is the safer and correct way, but it is a breaking change and all the data you collected so far will be lost), or convert each time to UTC before doing any computation (this will be expensive, slow, and potentially error prone). What do you think?
Couldn't you convert non-UTC dates to a standard format at server startup, without data loss?
It's true that it would be better to use UTC dates to be consistent across all countries...
from vince.
> Can't you convert non-UTC dates at server startup to a standard format without data loss?
We can do that. Let me think of something.
Are the other API endpoints working as intended? I want to be sure that only the timeseries endpoint is affected.
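The one-off conversion at startup could look roughly like the sketch below. `migrate_to_utc` and the in-memory list of event dicts are hypothetical; vince's real store layout is different.

```python
from datetime import datetime, timezone, timedelta

def migrate_to_utc(events):
    """Hypothetical one-off startup pass: rewrite every stored
    timezone-aware timestamp as its UTC equivalent."""
    for ev in events:
        ts = ev["timestamp"]
        # Only rewrite aware timestamps that are not already at UTC.
        if ts.tzinfo is not None and ts.utcoffset() != timedelta(0):
            ev["timestamp"] = ts.astimezone(timezone.utc)
    return events

paris = timezone(timedelta(hours=1))  # assumed fixed UTC+1 offset
events = [{"timestamp": datetime(2024, 3, 20, 0, 30, tzinfo=paris)}]
migrate_to_utc(events)
print(events[0]["timestamp"])  # 2024-03-19 23:30:00+00:00
```

Because the conversion preserves the instant (only the representation changes), no data is lost, and all later bucketing can safely assume UTC.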
from vince.
I think so, it's the only one; I don't have any problems with sources or pages.
from vince.
Awesome, I will think of something tomorrow.
from vince.
I don't know if this helps, but even though I'm on the same day (10am in France, 8am UTC), I still get the date duplication... Maybe it's not only the timezone?
{
"results": [
{
"timestamp": "2024-03-20T00:00:00Z",
"values": {
"bounce_rate": 1,
"pageviews": 14,
"views_per_visit": 1.5555555555555556,
"visit_duration": 174.63133333333332,
"visitors": 7,
"visits": 9
}
},
{
"timestamp": "2024-03-19T00:00:00Z",
"values": {
"bounce_rate": 1,
"pageviews": 15,
"views_per_visit": 1.5,
"visit_duration": 345.8795,
"visitors": 4,
"visits": 10
}
},
{
"timestamp": "2024-03-18T00:00:00Z",
"values": {
"bounce_rate": 1,
"pageviews": 25,
"views_per_visit": 1.7857142857142858,
"visit_duration": 65.617,
"visitors": 11,
"visits": 14
}
},
{
"timestamp": "2024-03-17T00:00:00Z",
"values": {
"bounce_rate": 1,
"pageviews": 39,
"views_per_visit": 1.8571428571428572,
"visit_duration": 157.9424285714286,
"visitors": 15,
"visits": 21
}
},
{
"timestamp": "2024-03-16T00:00:00Z",
"values": {
"bounce_rate": 1,
"pageviews": 38,
"views_per_visit": 1.9,
"visit_duration": 211.16555,
"visitors": 6,
"visits": 20
}
},
{
"timestamp": "2024-03-15T00:00:00Z",
"values": {
"bounce_rate": 1,
"pageviews": 23,
"views_per_visit": 1.6428571428571428,
"visit_duration": 226.07614285714286,
"visitors": 7,
"visits": 14
}
},
{
"timestamp": "2024-03-14T00:00:00Z",
"values": {
"bounce_rate": 1,
"pageviews": 23,
"views_per_visit": 1.6428571428571428,
"visit_duration": 129.684,
"visitors": 9,
"visits": 14
}
},
{
"timestamp": "2024-03-13T00:00:00Z",
"values": {
"bounce_rate": 1,
"pageviews": 45,
"views_per_visit": 1.8,
"visit_duration": 93.99807999999999,
"visitors": 7,
"visits": 25
}
},
{
"timestamp": "2024-03-14T00:00:00Z",
"values": {
"bounce_rate": 0,
"pageviews": 1,
"views_per_visit": 0,
"visit_duration": 0,
"visitors": 1,
"visits": 0
}
},
{
"timestamp": "2024-03-16T00:00:00Z",
"values": {
"bounce_rate": 1,
"pageviews": 5,
"views_per_visit": 5,
"visit_duration": 817.611,
"visitors": 2,
"visits": 1
}
},
{
"timestamp": "2024-03-17T00:00:00Z",
"values": {
"bounce_rate": 1,
"pageviews": 12,
"views_per_visit": 1.7142857142857142,
"visit_duration": 9.038,
"visitors": 4,
"visits": 7
}
},
{
"timestamp": "2024-03-18T00:00:00Z",
"values": {
"bounce_rate": 0,
"pageviews": 2,
"views_per_visit": 0,
"visit_duration": 0,
"visitors": 2,
"visits": 0
}
},
{
"timestamp": "2024-03-19T00:00:00Z",
"values": {
"bounce_rate": 1,
"pageviews": 4,
"views_per_visit": 4,
"visit_duration": 236.214,
"visitors": 2,
"visits": 1
}
},
{
"timestamp": "2024-03-20T00:00:00Z",
"values": {
"bounce_rate": 1,
"pageviews": 9,
"views_per_visit": 1.8,
"visit_duration": 4.5288,
"visitors": 2,
"visits": 5
}
}
]
}
from vince.
Interesting, this is helpful indeed. It looks like our bucketing implementation is buggy. This narrows down where I should focus.
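The symptom in the JSON above, the same timestamp appearing twice in `results`, suggests buckets are not being merged before the response is built. As a rough illustration (not vince's actual code), duplicate buckets could be collapsed like this:

```python
from collections import defaultdict

def merge_buckets(results):
    """Collapse rows that share a timestamp by summing their counters.

    Note: summing is only valid for additive metrics (pageviews, visits);
    rates, averages, and unique-visitor counts would need recomputing
    from raw events instead."""
    merged = defaultdict(lambda: defaultdict(float))
    for row in results:
        for metric, value in row["values"].items():
            merged[row["timestamp"]][metric] += value
    return [{"timestamp": ts, "values": dict(vals)}
            for ts, vals in sorted(merged.items())]

rows = [
    {"timestamp": "2024-03-20T00:00:00Z", "values": {"pageviews": 14, "visits": 9}},
    {"timestamp": "2024-03-19T00:00:00Z", "values": {"pageviews": 15, "visits": 10}},
    {"timestamp": "2024-03-20T00:00:00Z", "values": {"pageviews": 9, "visits": 5}},
]
print(merge_buckets(rows))
```

After merging, each timestamp appears once and the counts add up, which matches the user's observation that restarting the instance makes the data "correctly added together".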
from vince.
Hi, I updated vince to use local time everywhere. Not sure if it solves this problem, but it is a step in the right direction for how we handle time.
Please upgrade your container and give it a try.
from vince.
It looks good for now
from vince.
> It looks good for now

After the upgrade?
from vince.
Yes, I will wait until tomorrow to test across the timezone boundary
from vince.
The problem seems to have happened again. My old data is fine, each on a single date, but the duplication is still happening on today's dates.
{
"results": [
{
"timestamp": "2024-03-21T00:00:00Z",
"values": {
"bounce_rate": 1,
"pageviews": 4,
"views_per_visit": 4,
"visit_duration": 59.323,
"visitors": 1,
"visits": 1
}
},
{
"timestamp": "2024-03-13T00:00:00Z",
"values": {
"bounce_rate": 1,
"pageviews": 45,
"views_per_visit": 1.0714285714285714,
"visit_duration": 0,
"visitors": 7,
"visits": 42
}
},
{
"timestamp": "2024-03-14T00:00:00Z",
"values": {
"bounce_rate": 1,
"pageviews": 24,
"views_per_visit": 1.1428571428571428,
"visit_duration": 0,
"visitors": 9,
"visits": 21
}
},
{
"timestamp": "2024-03-15T00:00:00Z",
"values": {
"bounce_rate": 1,
"pageviews": 23,
"views_per_visit": 1.15,
"visit_duration": 9678.446899999999,
"visitors": 7,
"visits": 20
}
},
{
"timestamp": "2024-03-16T00:00:00Z",
"values": {
"bounce_rate": 1,
"pageviews": 43,
"views_per_visit": 1.075,
"visit_duration": 2300.2100999999993,
"visitors": 7,
"visits": 40
}
},
{
"timestamp": "2024-03-17T00:00:00Z",
"values": {
"bounce_rate": 1,
"pageviews": 51,
"views_per_visit": 1.0625,
"visit_duration": 4612.806375,
"visitors": 17,
"visits": 48
}
},
{
"timestamp": "2024-03-18T00:00:00Z",
"values": {
"bounce_rate": 1,
"pageviews": 27,
"views_per_visit": 1.125,
"visit_duration": 18510.507708333334,
"visitors": 12,
"visits": 24
}
},
{
"timestamp": "2024-03-19T00:00:00Z",
"values": {
"bounce_rate": 1,
"pageviews": 19,
"views_per_visit": 1.1875,
"visit_duration": 11121.47675,
"visitors": 4,
"visits": 16
}
},
{
"timestamp": "2024-03-20T00:00:00Z",
"values": {
"bounce_rate": 1,
"pageviews": 46,
"views_per_visit": 1.069767441860465,
"visit_duration": 6586.031023255815,
"visitors": 9,
"visits": 43
}
},
{
"timestamp": "2024-03-21T00:00:00Z",
"values": {
"bounce_rate": 1,
"pageviews": 6,
"views_per_visit": 2,
"visit_duration": 377.60966666666667,
"visitors": 2,
"visits": 3
}
}
]
}
from vince.
By the way, is it normal for the JSON returned by the endpoints to be pretty-printed? Minifying it could improve performance and speed.
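For comparison, here is what pretty-printed vs minified output looks like with a generic JSON library (Python here, purely for illustration; vince's encoder is a different implementation):

```python
import json

payload = {"results": [{"timestamp": "2024-03-21T00:00:00Z",
                        "values": {"visits": 1}}]}

# Pretty-printed: extra whitespace on every line.
pretty = json.dumps(payload, indent=4)
# Minified: no spaces after separators at all.
minified = json.dumps(payload, separators=(",", ":"))

print(len(pretty), len(minified))  # minified is noticeably smaller
```

The savings grow with payload size, so for a timeseries endpoint returning many buckets the minified form is the better default (and gzip on top shrinks both further).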
from vince.
> By the way, is it normal for the JSON returned by the endpoints to be pretty-printed? Minifying it could improve performance and speed.

Can you open an issue to ask for this? It is pretty-printed by accident (I forgot to remove it); an issue helps with tracking changes.
from vince.
No problem
from vince.
> The problem seems to have happened again. My old data is fine on a single date, but it's still happening on today's dates.
Any news on the problem?
from vince.
Hi, I can't really reproduce this locally, so it is hard for me to solve. I will try again this week to find a way to reproduce it.
from vince.
@Ziedelth please be patient with me, I will get back to you as soon as I have something. Meanwhile, don't hesitate to open an issue for anything you encounter.
from vince.
No problem, take your time, I know what it's like to work alone on a project.
from vince.
Hi, any news on the problem?
from vince.
Hi @Ziedelth, just wanted to let you know I'm now focusing on the roadmap that will address this.
Can I ask how long you have had your vince instance running? It will help me plan the migration path.
from vince.
Greetings!
So, I have an instance that has been running for 1 week without rebooting. When I restart the instance, the data is correctly added together again.
But if it's the data you're interested in, the oldest dates back to March 13.
from vince.
> But if it's the data you're interested in, the oldest dates back to March 13.

Thanks, this is what I was interested in.
from vince.
I am considering grouping events into `year`, `month`, `day` and `hour` buckets. All API calls will then operate with the expectation that computations don't reflect the exact time an event occurred, but rather the time bucket in which the event happened.
@Ziedelth what do you think?
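The proposed bucketing could be sketched as follows; the string key formats are assumptions just to make the idea concrete, not vince's actual keys:

```python
from datetime import datetime, timezone

def bucket_keys(ts):
    """Truncate an event time to each proposed bucket granularity,
    producing one key per level (year/month/day/hour)."""
    return {
        "year":  ts.strftime("%Y"),
        "month": ts.strftime("%Y-%m"),
        "day":   ts.strftime("%Y-%m-%d"),
        "hour":  ts.strftime("%Y-%m-%d %H:00"),
    }

ev = datetime(2024, 3, 20, 10, 42, 7, tzinfo=timezone.utc)
print(bucket_keys(ev))
```

Two events with the same keys are indistinguishable to queries, which is what makes aggregation cheap; the trade-off is that sub-bucket precision (minutes, seconds) is deliberately discarded.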
from vince.
But by grouping them by `hour`, will the timezone problem really be corrected?
from vince.
> But by grouping them by hour, will the timezone problem really be corrected?

Theoretically, yes. I think we can default to UTC. We will test and see when I have the implementation ready; I'm not so sure though.
Maybe we can just group by `year`, `month` and `day`? Plausible and other open source tools do this.
What do you think?
from vince.
Yeah it's a good choice!
from vince.
> Yeah it's a good choice!

Which one?
from vince.