ryapric / loggit
Modern Logging for the R Ecosystem
License: Other
Thanks for the great tool. As far as I can see in the source code, the log files are not automatically rotated. I would suggest the following implementation:
Add a rotate_lines=NULL entry to the .config environment.
Add set_rotate_lines and get_rotate_lines.
loggit can check whether .config$rotate_lines is set and, if it is non-NULL, rotate the logs.
This would leave the implementation backwards compatible, but would allow using {loggit} in long running sessions where the rotate_lines limit is likely to be exceeded, e.g. Plumber APIs etc.
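A minimal sketch of the proposal above, using a stand-alone environment rather than loggit's actual internals. The .config object here is a local stand-in, and maybe_rotate is a hypothetical helper; none of these names is confirmed loggit API.

```r
# Hypothetical stand-in for the package's internal .config environment.
.config <- new.env(parent = emptyenv())
.config$rotate_lines <- NULL  # NULL means "never rotate" (backwards compatible)

set_rotate_lines <- function(n) {
  stopifnot(is.null(n) || (is.numeric(n) && length(n) == 1 && n > 0))
  .config$rotate_lines <- n
  invisible(n)
}

get_rotate_lines <- function() .config$rotate_lines

# Hypothetical helper: keep only the newest n lines when a limit is set.
maybe_rotate <- function(lines) {
  n <- get_rotate_lines()
  if (is.null(n)) return(lines)
  utils::tail(lines, n)
}
```

With the limit unset, maybe_rotate() returns its input unchanged, which is the backwards-compatible path described above.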
I am happy to work on a PR if this is something you can see as a useful contribution.
Also remove the limitations it imposes in loggit.R, and the testthat bypass in test_loggit.R.
First of all, thank you for this package!
Now, for the problem I encounter:
library(loggit)
library(glue)
loggit("INFO", "First message")
loggit("INFO", "Second message")
# Warning message:
# In bind_rows_(x, .id) :
# Vectorizing 'glue' elements may not preserve their attributes
This warning pops up in bind_rows because glue::glue returns an object of class glue, which is not present in the column of the dataframe.
I see two solutions to overcome this:
Use as.character(log_msg) to automatically convert every log message (and log details) into character.
Use suppressWarnings(dplyr::bind_rows(...)) to avoid this type of warning.
Note that I expect this kind of warning to happen not just with glue but for all objects with attributes other than character (dates, durations, ...).
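The first option can be illustrated without glue installed. The sketch below fakes a glue string with structure(), which is an assumption made only to keep the example self-contained, and shows that as.character() drops the extra class.

```r
# Stand-in for a glue string: a character vector carrying an extra class.
msg <- structure("Second message", class = c("glue", "character"))

# as.character() strips the class attribute and leaves a plain character vector.
plain <- as.character(msg)

class(msg)    # "glue" "character"
class(plain)  # "character"
```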
Hi Ryan,
Is there a way to set rotating logs?
Loggit appends now, but wouldn't it be nice if you could rotate the logs?
Thanks for this simple and great package.
A common need is to register logs in different files by date. E.g. by day:
loggit-2021-03-10.log
loggit-2021-03-11.log
loggit-2021-03-12.log
...
It would be useful if the set_logfile function allowed inserting date variables, e.g.:
set_logfile(logfile = 'loggit-%Y-%m-%d.log')
It would then automatically create and log to the file that matches the current date.
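A rough sketch of how such a pattern could be resolved in base R. resolve_logfile is a hypothetical helper, not part of loggit; it simply passes the pattern to format().

```r
# Hypothetical helper: expand strftime-style placeholders in a file name.
resolve_logfile <- function(pattern, when = Sys.time()) {
  format(when, format = pattern)
}

resolve_logfile("loggit-%Y-%m-%d.log", as.Date("2021-03-10"))
# "loggit-2021-03-10.log"
```

Resolving the name on every write, rather than once at configuration time, is what would let a long-running session roll over to a new file at midnight.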
I'm available for a pull request if you consider adding this feature into the package.
Messages that contain a : are not displayed correctly in the log but are cut off
> loggit::message("This won't: work")
{"timestamp": "2023-12-01T16:17:12+0100", "log_lvl": "INFO", "log_msg": "This won't: work"}
This won't: work
> loggit::read_logs()
timestamp log_lvl log_msg
1 2023-12-01T16:17:12+0100 INFO This won't
Thank you for this very useful tool! Unfortunately, rotate_logs() will cause an error ("Error in xj[i] : only 0's may be mixed with negative subscripts") if there are fewer than rotate_lines lines in the logfile.
I suggest the following fix: Replace lines 85-86 of loggit/R/utils.R with
if (nrow(log_df) > rotate_lines) {
  log_df <- log_df[(nrow(log_df) - rotate_lines + 1):nrow(log_df), ]
  write_ndjson(log_df, logfile, echo = FALSE, overwrite = TRUE)
}
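The error message suggests that the row index runs from a negative number up to nrow(log_df) when the file is short. The sketch below reproduces that with a toy data frame (the data and the limit are made up) and shows utils::tail() as an alternative that needs no guard.

```r
log_df <- data.frame(log_msg = c("a", "b", "c"), stringsAsFactors = FALSE)
rotate_lines <- 5

# With 3 rows and a limit of 5 the sequence is -1:3, which mixes
# negative and positive subscripts.
idx <- (nrow(log_df) - rotate_lines + 1):nrow(log_df)
failed <- inherits(try(log_df[idx, , drop = FALSE], silent = TRUE), "try-error")

# tail() returns at most rotate_lines rows and never fails on short input.
kept <- utils::tail(log_df, rotate_lines)
```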
In many of our production contexts, we log, usually in JSON, to STDOUT (or in rare cases STDERR), and have external log collectors that capture logs across many services.
The filename test in configurations.R seems to prevent me from doing what I'd normally do for this:
# gives me a writeable file handler for the standard output stream
log_file <- file("stdout", "w")
I think it would be helpful to allow for these basic streams as output.
This is also in concert with the logging factor of the 12-factor app, which recommends:
A twelve-factor app never concerns itself with routing or storage of its output stream. It should not attempt to write to or manage logfiles. Instead, each running process writes its event stream, unbuffered, to stdout.
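For illustration, writing one JSON line per entry to a connection needs only base R. log_to_stream is a hypothetical function, not loggit API, and the escaping covers only backslashes, double quotes and newlines.

```r
# Hypothetical sketch: emit one JSON log line to any connection (stdout by default).
log_to_stream <- function(log_lvl, log_msg, con = stdout()) {
  esc <- function(x) {
    x <- gsub("\\", "\\\\", x, fixed = TRUE)  # backslash first
    x <- gsub('"', '\\"', x, fixed = TRUE)    # then double quotes
    gsub("\n", "\\n", x, fixed = TRUE)        # then newlines
  }
  line <- sprintf(
    '{"timestamp": "%s", "log_lvl": "%s", "log_msg": "%s"}',
    format(Sys.time(), "%Y-%m-%dT%H:%M:%S%z"), esc(log_lvl), esc(log_msg)
  )
  writeLines(line, con = con)
  invisible(line)
}

log_to_stream("INFO", "service started")
log_to_stream("ERROR", "something broke", con = stderr())
```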
In order to allow for conformance to a corporate logging standard, it would be helpful to allow renaming the default log fields in a configuration block of some sort.
For example, I usually need to write log_lvl with the key level and log_msg with the key msg. While I can do that now by adding those keys at the end, it seems I'll end up having extra keys I don't use.
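One way such a mapping could work is a named character vector applied to the entry just before it is serialized. rename_fields and the field_map argument are hypothetical names used only for this sketch.

```r
# Hypothetical helper: rename entry fields according to a user-supplied map.
rename_fields <- function(entry, field_map) {
  hits <- names(entry) %in% names(field_map)
  names(entry)[hits] <- unname(field_map[names(entry)[hits]])
  entry
}

entry <- list(timestamp = "2020-05-25T16:50:48-0400", log_lvl = "INFO", log_msg = "hello")
renamed <- rename_fields(entry, c(log_lvl = "level", log_msg = "msg"))
names(renamed)
# "timestamp" "level" "msg"
```

Fields that are not in the map keep their default names, so an empty map reproduces the current output.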
Run some R CMD checks to confirm which versions, but 3.6.0 failed, as did 3.4.0 (which is the minimum version currently in the DESCRIPTION file).
Would you be interested in me filing a PR for a config setting that automatically replaces : and \n in messages before they're written? Any ideas what they would get replaced with? I'm using _ and ___ respectively in my logs. It would be up to you whether that's a default or not. I haven't been able to figure out why colons in some places are problematic and not others, but newlines always prevent reading back the logs.
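The replacement described above amounts to two fixed-string substitutions. sanitize_msg is a hypothetical helper, with the replacement strings from the issue as defaults.

```r
# Hypothetical helper: replace colons and newlines before a message is written.
sanitize_msg <- function(x, colon = "_", newline = "___") {
  x <- gsub(":", colon, x, fixed = TRUE)
  gsub("\n", newline, x, fixed = TRUE)
}

sanitize_msg("This won't: work\nsecond line")
# "This won't_ work___second line"
```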
I have a function that I'm logging that occasionally fails badly due to external data sources. I handle it by using purrr::safely and retrying that item later. Usually that means that this function won't print anything to the console, just skip over that item. However, using this package, the deeply hidden stop(...) calls instead manage to print to the log anyway.
Is there any way to avoid this? Is this something for the purrr developers to fix?
library(loggit)
#> Warning: package 'loggit' was built under R version 3.6.2
#>
#> Attaching package: 'loggit'
#> The following objects are masked from 'package:base':
#>
#> message, stop, warning
noisy_fn <- function(x) {
  if (x < 5) {message(x)}
  if (x >= 5) {stop(x)}
  x
}
purrr::map(3:6, purrr::safely(noisy_fn))
#> {"timestamp": "2020-05-25T23:23:54-0400", "log_lvl": "INFO", "log_msg": "3"}
#> 3
#> {"timestamp": "2020-05-25T23:23:54-0400", "log_lvl": "INFO", "log_msg": "4"}
#> 4
#> {"timestamp": "2020-05-25T23:23:54-0400", "log_lvl": "ERROR", "log_msg": "5"}
#> {"timestamp": "2020-05-25T23:23:54-0400", "log_lvl": "ERROR", "log_msg": "6"}
#> [[1]]
#> [[1]]$result
#> [1] 3
#>
#> [[1]]$error
#> NULL
#>
#>
#> [[2]]
#> [[2]]$result
#> [1] 4
#>
#> [[2]]$error
#> NULL
#>
#>
#> [[3]]
#> [[3]]$result
#> NULL
#>
#> [[3]]$error
#> <simpleError in stop(x): 5>
#>
#>
#> [[4]]
#> [[4]]$result
#> NULL
#>
#> [[4]]$error
#> <simpleError in stop(x): 6>
purrr::map(3:6, purrr::possibly(noisy_fn, otherwise = NA))
#> {"timestamp": "2020-05-25T23:23:54-0400", "log_lvl": "INFO", "log_msg": "3"}
#> 3
#> {"timestamp": "2020-05-25T23:23:54-0400", "log_lvl": "INFO", "log_msg": "4"}
#> 4
#> {"timestamp": "2020-05-25T23:23:54-0400", "log_lvl": "ERROR", "log_msg": "5"}
#> {"timestamp": "2020-05-25T23:23:54-0400", "log_lvl": "ERROR", "log_msg": "6"}
#> [[1]]
#> [1] 3
#>
#> [[2]]
#> [1] 4
#>
#> [[3]]
#> [1] NA
#>
#> [[4]]
#> [1] NA
Created on 2020-05-25 by the reprex package (v0.3.0)
To be clear, in both examples above, I'd expect the INFO messages to print since that's part of the "normal" flow, but not the ERROR messages. Below is the output if I don't use the loggit library:
purrr::map(3:6, purrr::safely(noisy_fn))
#> 3
#> 4
#> [[1]]
#> [[1]]$result
#> [1] 3
#>
#> [[1]]$error
#> NULL
#>
#>
#> [[2]]
#> [[2]]$result
#> [1] 4
#>
#> [[2]]$error
#> NULL
#>
#>
#> [[3]]
#> [[3]]$result
#> NULL
#>
#> [[3]]$error
#> <simpleError in .f(...): 5>
#>
#>
#> [[4]]
#> [[4]]$result
#> NULL
#>
#> [[4]]$error
#> <simpleError in .f(...): 6>
purrr::map(3:6, purrr::possibly(noisy_fn, otherwise = NA))
#> 3
#> 4
#> [[1]]
#> [1] 3
#>
#> [[2]]
#> [1] 4
#>
#> [[3]]
#> [1] NA
#>
#> [[4]]
#> [1] NA
Created on 2020-05-25 by the reprex package (v0.3.0)
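One plausible explanation, assuming the masking stop() writes its log entry before it signals the error: the entry is a side effect that has already happened by the time a handler such as the one inside purrr::safely catches the condition. The base R sketch below reproduces that ordering with a hypothetical logging_stop function and tryCatch in place of purrr.

```r
# Hypothetical wrapper that logs first and signals the error afterwards.
logging_stop <- function(msg) {
  cat(sprintf('{"log_lvl": "ERROR", "log_msg": "%s"}\n', msg))  # side effect
  base::stop(msg, call. = FALSE)                                 # condition
}

# The handler swallows the error, but the log line was already printed.
out <- capture.output(
  res <- tryCatch(logging_stop("5"), error = function(e) NA)
)
```

Here res is NA while out still holds the ERROR line, which matches the behaviour shown in the reprex.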
I'm in the process of changing over from handmade log files to a uniform automated system, and so far I like loggit. However, a lot of my old scripts have lines like message("The number of rows written to db was: ", nrow(table)), and the version that you implement for masking the base message ignores the multiple arguments and returns output like this:
> message("The number of rows: ", nrow(iris))
{"timestamp": "2020-05-25T16:50:48-0400", "log_lvl": "INFO", "log_msg": "The number of rows: "}
The number of rows: 150
The output to the console is correct! But the line written to the log only captures the first arg. It looks like this is an intentional design choice, because the source calls loggit with args[[1]] instead of something like paste(args, collapse = " ").
Is this changeable, or should I be adjusting my usage to match this behavior?
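As a sketch of the alternative, all arguments can be converted to character and concatenated without a separator, which is how the console output above is formed. collapse_args is a hypothetical helper and is not taken from the loggit source.

```r
# Hypothetical helper: join every argument into one log message string.
collapse_args <- function(...) {
  parts <- vapply(
    list(...),
    function(a) paste(as.character(a), collapse = ""),
    character(1)
  )
  paste0(parts, collapse = "")
}

rows <- data.frame(x = 1:150)
collapse_args("The number of rows: ", nrow(rows))
# "The number of rows: 150"
```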
Should reflect goal and mission of the package now that v2.0.0 is almost out.
Even though this repo has been inactive for a long time, I hope that I can still help to improve a few things here.
Since loggit's stop, warning and message internally call their base equivalents, the loggit function is always displayed as the call. This could be prevented by calling the R internal functions instead.
In the same way, one could very easily support stopifnot.
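A different route to the same result, shown here only as an assumption-laden sketch and not as the approach the issue proposes: build the condition object by hand and attach the caller's call with sys.call(-1). stop_from_caller is a hypothetical name.

```r
# Hypothetical wrapper: report the caller, not the wrapper, as the failing call.
stop_from_caller <- function(msg) {
  caller <- sys.call(-1)  # the call of the function that invoked this wrapper
  base::stop(simpleError(msg, call = caller))
}

f <- function(x) stop_from_caller("boom")

err <- tryCatch(f(1), error = function(e) e)
deparse(conditionCall(err))
# "f(1)"
```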
msg <- paste0("Package: smvgraph %s", utils::packageVersion("smvgraph"), "\n(C) 2022- Sigbert Klinke, HU Berlin")
loggit("DEBUG", msg, echo=FALSE)
leads to the following entry in the loggit file:
{"timestamp": "2022-03-19T18:51:36+0100", "log_lvl": "DEBUG", "log_msg": "Package: smvgraph %s0.2.0__LF__(C) 2022- Sigbert Klinke__COMMA__ HU Berlin"}
But reading the log into R produces
read_logs()
timestamp log_lvl log_msg
1 2022-03-19T18:51:36+0100 DEBUG Package
In your README, you note:
If you really wish to have all exception messages logged by loggit, please be patient, as this feature is in the works.
I'm wondering about the timeline for implementation for this, as it's critical for me to be able to do that for this package to be useful for production work.
The way loggit works now is incredibly non-performant: in order to write a log entry, it must first read in the entire log file, append to the data.frame representation, and then write the whole thing back out. Switching to ndjson will retain the JSON format, but allow for separation of concerns on a line-by-line basis. This will make the cost of each write negligible, and scale up to available disk space.
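The difference is easy to see in base R: with one JSON object per line, a write is a plain append and never touches earlier entries. append_entry is a hypothetical helper for this sketch, and the JSON strings are written by hand.

```r
# Hypothetical helper: append one ndjson line without reading the file first.
append_entry <- function(json_line, logfile) {
  cat(json_line, "\n", file = logfile, sep = "", append = TRUE)
}

logfile <- tempfile(fileext = ".log")
append_entry('{"log_lvl": "INFO", "log_msg": "first"}', logfile)
append_entry('{"log_lvl": "INFO", "log_msg": "second"}', logfile)

length(readLines(logfile))
# 2
```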
stopifnot is also a base condition function and should therefore be supported
Thank you for this great logging package, and for considering this feature request!
It would be great to be able to read a remote log file from a URL, instead of a local filepath, e.g., queries <- read_logs("https://raw.githubusercontent.com/USER/REPO/main/queries.log")
Since echo is TRUE by default, the console is easily spammed with no new information, so I would recommend changing this.
This would also have the nice side effect that you could add a log to packages by simply importing loggit without changing the other behavior (in combination with my issue #23).
Alternatively, it would also be possible to introduce an option with default TRUE if backward compatibility is to be guaranteed.
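The backward-compatible variant could be sketched with a global option. The option name loggit.echo is made up for this example, as are default_echo and log_line.

```r
# Hypothetical: read the echo default from an option that falls back to TRUE.
default_echo <- function() isTRUE(getOption("loggit.echo", default = TRUE))

# Hypothetical logger whose echo argument defaults to the option.
log_line <- function(msg, echo = default_echo()) {
  if (echo) cat(msg, "\n", sep = "")
  invisible(msg)
}

log_line("echoed, because the option is unset")
options(loggit.echo = FALSE)
log_line("silent, because the option is FALSE")
```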
Current behavior is a consequence of R's vectorization. This might take some fudging, since R would then read in the values as arrays into a single df cell, vs. separate rows.
I've grown to be wary of external dependencies, both in terms of potential breakage and in terms of install time/size. A user who just wants to log their script/package entries shouldn't need to spend 20 minutes compiling C/C++ packages (on a Linux deployment host) just to enable that feature. loggit is currently using dplyr only for its bind_rows() functionality, which is easy to replicate in base R. It's leaning heavily into jsonlite though, and that will take some fudging on the read-in-the-data side of things. But writes should be easy to roll my own.
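A minimal base R sketch of the row-binding behaviour in question, where columns missing from one side are filled with NA. bind_rows_base is a hypothetical name and covers only the two-data-frame case.

```r
# Hypothetical base R replacement for binding two data frames by column name.
bind_rows_base <- function(a, b) {
  all_cols <- union(names(a), names(b))
  for (col in setdiff(all_cols, names(a))) a[[col]] <- rep(NA, nrow(a))
  for (col in setdiff(all_cols, names(b))) b[[col]] <- rep(NA, nrow(b))
  rbind(a[all_cols], b[all_cols])
}

old <- data.frame(log_lvl = "INFO", log_msg = "first", stringsAsFactors = FALSE)
new <- data.frame(log_lvl = "WARN", log_msg = "second", user = "me",
                  stringsAsFactors = FALSE)
combined <- bind_rows_base(old, new)
```

The combined data frame has two rows and a user column that is NA for the first entry.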
Vignette topics:
Automated data validation
Logging to stdout (user will need to wrap calls to the right suppressor function, if not calling loggit() directly)
To make the console output look cleaner, one could replace all the print calls that use paste
if (confirm) print(paste0("Log file set to ", logfile))
with cat calls
if (confirm) cat("Log file set to ", logfile, "\n")
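The difference in console output can be checked with capture.output(). Note that cat() separates its arguments with a space by default, so the sketch passes sep = "" to avoid a doubled space before the file name; the file name itself is made up.

```r
logfile <- "loggit.log"

# print() shows the index prefix and the surrounding quotes.
printed <- capture.output(print(paste0("Log file set to ", logfile)))

# cat() writes the bare text; sep = "" prevents extra spaces between pieces.
catted <- capture.output(cat("Log file set to ", logfile, "\n", sep = ""))
```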
In the current loggit.R, echoing of log message is handled by write_ndjson()
write_ndjson(log_df, echo = echo)
Line 83 in 5399852
which in turn calls cat() if echo = T.
if (echo) cat(logdata, sep = "\n")
Line 102 in 5399852
Previously, in version 1.1.1 (https://cran.r-project.org/src/contrib/Archive/loggit/loggit_1.1.1.tar.gz), if echo = T, the base message function is called in loggit.R.
if (echo) base::message(paste(c(log_lvl, log_msg), collapse = ": "))
Switching from message() to cat() causes loggit's console output to be suppressed during R Markdown rendering, because knitr::knit_hooks has options to handle message() output but nothing to handle cat() output (https://bookdown.org/yihui/rmarkdown-cookbook/output-hooks.html).
Here is an example code snippet (I replaced the markdown backticks with single quotes because I couldn't figure out a way to keep them from confusing the code block formatting):
'''{r setup, include=FALSE}
library(loggit)
'''
'''{r test message, echo=F,message=F, warning=F}
loggit("INFO", "loggit message", echo = T)
message('base message\n')
'''
Only "base message" will be printed to console during rendering.
I suggest switching back from cat() to message() when echo = T.
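The practical difference is that message() signals a condition that calling code can intercept, while cat() only writes text. A small sketch, with echo_via_message as a hypothetical stand-in for the echo step:

```r
# Hypothetical echo step that uses message(), as in version 1.1.1.
echo_via_message <- function(log_lvl, log_msg) {
  base::message(paste(c(log_lvl, log_msg), collapse = ": "))
}

# A calling handler can collect (and silence) the message condition.
captured <- character(0)
withCallingHandlers(
  echo_via_message("INFO", "loggit message"),
  message = function(m) {
    captured <<- c(captured, conditionMessage(m))
    invokeRestart("muffleMessage")
  }
)
```

Text written with cat() would not reach the handler at all, which is why tools that hook into message conditions treat the two differently.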