atorus-research / metacore Goto Github PK
View Code? Open in Web Editor NEWHome Page: https://atorus-research.github.io/metacore/
License: Other
Home Page: https://atorus-research.github.io/metacore/
License: Other
When using spec_type_to_ds_vars()
with the key_seq
variable in the main sheet it will not get passed through to create_table
and will not get added on later. It looks as though only the cols
argument is passed to create_table()
therefore the key_seq
variable will not get passed through and added on.
I currently do not have the ability to create a branch. I wanted to make the changes to yn_to_tf to address Issue #57 by creating a branch off of dev, modifying the code and doing a PR to get my changes into dev.
Left is "main" branch, right is "dev" branch. The change was made to address Issue #47 .
There needs to be an additional, small change made to the regex:
if(all(is.na(x) | str_detect(x, regex("^y$|^n$|yes$|no$", ignore_case = T)))){
case_when(str_detect(x, regex("^y$|yes$", ignore_case = T)) ~ TRUE,
str_detect(x, regex("^n$|no$", ignore_case = T)) ~ FALSE,
"ABCy" would be set to NA. But "ABCyes" would be set to TRUE. Same with "ABCn" vs "ABCno" being set to NA vs FALSE.
At the moment multiple issues are put into a single warning message. For example
Warning messages:
1: order from the ds_vars table only contain missing values.
keep from the ds_vars table only contain missing values.
supp_flag from the ds_vars table only contain missing values.
sig_dig from the value_spec table only contain missing values.
dataset from the supp table only contain missing values.
variable from the supp table only contain missing values.
idvar from the supp table only contain missing values.
qeval from the supp table only contain missing values.
This makes it hard to filter out warnings which are not of interest. For example, we have implemented a function to create a metacore object from the Roche standard specifications. As the Roche specs do not include significant digits, we would like to suppress the corresponding warning.
Could metacore create separate warnings for each detected issue?
Then it would be easy to suppress warnings which do not apply by calling admiraldev::suppress_warning()
. For example:
suppress_warning(
metacore(
ds_spec = ds_spec,
ds_vars = ds_vars,
var_spec = var_spec,
value_spec = value_spec,
derivations = derivations,
codelist = codelist
),
regexpr = "sig_dig from the value_spec table only contain missing values."
)
Remove this function. It will live in another package
@statasaurus we had success using 0.0.4 version of metacore on CRAN, but when we tried our xls spec reader on latest 0.0.6 dev version we got:
Caused by error in .data$predecessor
:
! Column predecessor
not found in .data
.
I thought the latest updates were only focused on define.xml compatibility - did something else change around a "predecessor" field?? We'd need to account for this in our spec reader function in admiralroche.
Hi, I have been testing/started considering the metacore/metatools packages for ADaM work and downloaded the latest versions from github this week, following the readme instructions. I would like to report potential issues related to building metacore from Excel specs (following the P21 Excel file formats) and related to importing from define-xml. I have regrouped in one post as these are all potential issues related to importing functionality.
spec_to_metacore (using the old P21 Excel file format - with a WhereClauses sheet): issue with importing composite where clauses (i.e. conjunctive clauses) like: PARAMCD EQ "CUMPLDOS" and AVISIT EQ "POST TREATMENT FOLLOW-UP 2". Such a composite Where clause is split into 2 records in the WhereClauses sheet, bearing the same ID and holding each a base clause. After importing, 2 records are created in VALUE_SPEC, one for each base clause. I would expect a single record instead in VALUE_SPEC with the composite clause rebuilt. Likely the issue is in spec_type_to_value_spec(... where_sep_sheet = TRUE... ).
define_to_metacore : similar issues with importing composite where clauses. In define-xml, such where clauses are also split into basic clauses. While importing from define, one record is created in VALUE_DESC but only the first part of the composite where clause is imported, the rest is ignored. So, the issue is related but slightly different than 1).
spec_to_metacore (using the new P21 Excel file format - without the WhereClauses sheet and the where clauses directly in ValueLevel): fails to import and throws errors. However, import works following the stepwise approach with the individual spec_type_to_ functions. I assume spec_to_metacore is tweaked towards the old P21 Excel file format. However, considering that the new P21 Excel file format is now used in recent releases of the P21 Enterprise and Community tools, it would be nice that the function can import directly the newer format as well. Note: here the composite where clauses import well in VALUE_SPEC - which may be expected as they are already on a single line in the input file.
spec_to_metacore: In the P21 Excel file format, for variables/VLM with Origin = "Predecessor", the details are stored in the Predecessor column, for variables with Origin = "Assigned", the details are stored in the Comment column. The import function only handles Origin = "Derived" and reads the details from the Method column into DERIVATIONS. The predecessor and assignment details are lost during import. Looking at the metacore_example("pilot_ADaM.rda"), all the predecessor and assignment details are stored in the DERIVATIONS table. Technically as a VALUE_SPEC record can have only a single Origin, I understand that the origin details may always be stored in the DERIVITIONS table whatever the Origin - as shown in pilot_ADaM.rda. But then it would make sense for spec_to_metacore (or spec_type_to_value_spec) to be able to map the Predecessor column or Comment column into the DERIVATIONS table as well, according to the value of Origin.
define_to_metacore: faces a similar issue to 4) when Origin="Predecessor" or "Assigned", the detailed origin description is also lost and not stored in the DERIVATIONS table.
Data fields have been updated - text is accurate. So now need to update the graphics. @MayaGans whenever you have some time to get to this :)
It seems that metacore requires dplyr 1.0.0. I tried to use it with dplyr 0.8.5 and it failed. After updating dplyr to 1.0.9 it worked.
See pharmaverse/pharmaverse#177 - FYI @statasaurus opened here as i closed the pharmaverse issue, so that you have it on radar.
When I call spec_type_to_value_spec in this manner:
value_spec <- spec_type_to_value_spec(
doc,
cols = c(
dataset = "DATASET",
variable = "VARIABLE",
origin = "ORIGIN",
type = "VLM_TYPE",
code_id = "CONTROL_TERM_NAME",
where = "VALUE",
derivation_id = "COMP_METHOD_NAME",
sig_dig = "SIG_DIGITS"
),
where_sep_sheet = FALSE,
sheet = "VLM"
)
... I receive the following:
Error in `mutate()`:
! Problem while computing `derivation_id = case_when(...)`.
Caused by error in `.data$predecessor`:
! Column `predecessor` not found in `.data`.
Run `rlang::last_error()` to see where the error occurred.
I was told that this is due to my not having the predecessor parameter in my call. If so, the spec_type_to_value_spec function should be checking that this parameter is present and, if not, inform the user with a meaningful message and exit gracefully.
We should get started with a NEWS.md file to track and document changes between version. As a reference, see here
Error without explanation. I was not able to create derivation.
derivation <- spec_type_to_derivations(
doc,
cols = c(derivation_id = "ID", derivation = "[D|d]efinition|[D|d]escription"),
sheet = "Method|Derivations?",
var_cols = c(dataset = "[D|d]ataset|[D|d]omain", variable = "[N|n]ame|[V|v]ariables?",
origin = "[O|o]rigin", predecessor = "[P|p]redecessor", comment = "[C|c]omment")
)
@statasaurus can we put in a function that returns that Pilot AdaM metacore object? We have metacore_example() which gives file paths. But what about like a load_metacore_adam_example() or something much less verbose that returns the demo object?
I was looking for this when I was unit testing on metatools. It would even be a nice touch to have the individual tables as dataframes available so you can pull them into memory quick and create a metacore object.
Test at "A" in screenshot is checking for 'Yes' or 'yes' or 'No' or 'no' or 'Y' or 'y' or 'N' or 'n', so a 'Y' or 'N' would pass the test. But then at "B" in screenshot, the values of only 'Yes' or 'yes' are assigned TRUE and the values of only 'No' or 'no' are assigned FALSE. 'Y', 'y', 'N', 'n' are assigned to NA.
I discovered this because spec_type_to_ds_vars() calls yn_to_tf() for the keep variable.
Also, a value of 'YES' or 'NO' is unacceptable and triggers the Warning message in the "else" block of the yn_to_tf function.
Current spec_type_to_var_spec
combines multiple rows into one when variable, length, label, type
are the same, but formats are different, e.g., below two rows from Variables
sheet are considered as one row, so one of the formats is dropped:
Order | Dataset | Variable | Label | Data Type | Length | Significant Digits | Format |
---|---|---|---|---|---|---|---|
28 | ADPFT | PARAMCD | Parameter Code | Text | 8 | $PFT | |
22 | ADAETTE | PARAMCD | Parameter Code | Text | 8 | $AETTE |
var_spec
:
variable | length | label | type | format | common |
---|---|---|---|---|---|
PARAMCD | 8 | Parameter Code | Text | $PFT | FALSE |
Should format
be considered when removing duplicates in spec_type_to_var_spec
?
(Can use the r/pharma workshop excel to test this)
Error without explanation. I was not able to create value_spec:
value_spec <- spec_type_to_value_spec (
doc,
cols = c(dataset = "[D|d]ataset|[D|d]omain", variable = "Variable",
origin = "[O|o]rigin", type = "[T|t]ype", code_id = "[C|c]odelist|Controlled Term",
sig_dig = "[S|s]ignificant", where = "[W|w]here", derivation_id = "[M|m]ethod",
predecessor = "[P|p]redecessor"),
sheet = NULL,
where_sep_sheet = TRUE,
where_cols = c(id = "ID", where = c("Variable", "Comparator", "Value")),
var_sheet = "Variables"
)
With dev tidyselect there are a bunch of failures like:
Error (test-metacore.R:68:4): metacore wrapper function works
<dplyr:::mutate_error/rlang_error/error/condition>
Error in `mutate(., key_seq = str_split(key_seq, ",\\s"), key_seq = map(key_seq,
function(x) {
tibble(variable = x) %>% mutate(key_seq = row_number())
}))`: Problem while computing `key_seq = str_split(key_seq, ",\\s")`.
Caused by error in `vctrs::vec_size_common()`:
! object 'key_seq' not found
This comes down to a fairly subtle problem create_tbl()
used a named argument to tidyselect::matches()
which used to work by conicidence, but not longer does.
Add option to read define quitely, this will be very helpful for examples
Going through the README.md, I noticed some things which may need to be fixed:
General: the diagrams have the df and field names in upper case; all should be lower case?
ds_spec:
ds_vars: Only 5 out of 7 fields are shown in the diagram; order and supp_flag are not shown.
var_spec:
" ... contains the information the purely variable level information" -> "... contains the purely variable-level information"
Only 5 out of 6 fields are shown in the diagram; format is not shown.
derivation:
codelist:
supp: "To see a metacore object in about please see our vignettes" -> "To see more information about the metacore object, please see the vignettes." ?
The first release of {gt} was 2020-03-31. Our github workflow is set up right now to test on a R 3.6 snapshot from 2020-02-29 and a R 3.5 snapshot from earlier. This causes a build error because {gt} doesn't exist on the MRAN snapshot from that time period.
This is a simple fix as we just need to remove the use of {gt} and replace with another HTML table package.
If a derivation is used by multiple variables in a dataset the derivation gets duplicated in the deviation table
Line 226 in c583332
@mstackhouse is it possible to limit the origin options to eCRF, Derived, Predecessor, and Assigned? Are those all the options?
/vignettes/Building_Specification_readers.Rmd:
Line 129 of /R/validators.R:
when i run metacore, i'm getting the following warning message even though all core values are Required/Expected/Permissible:
The following words in ds_vars$core are not allowed:
NA
> metacore$ds_vars %>%
+ filter(core == "NA")
# A tibble: 0 × 7
Have you seen this before? Any ideas?
The idea of centralizing metadata into a relational data structure is a great one, although right now the current model only accounts for a "slice" of possible metadata storage within an R session. Think of one metacore object per dataset type (e.g. SDTM x ADAM), Task, Study, etc. One could of course just loop over the different dimensions and store individual metacore objects in a list, but some of this information could also be natively incorporated into the model. Are you currently discussing on how to expand the structure to account for these other dimensions or is this not part of the scope of the project?
In spec_type_to_ds_vars()
if the keep variable is already a boolean it causes a failure when attempting to build the ds_vars
since yn_to_tf()
is only looking for something that contains Yes/No
I just tried to install {metacore} into our validated R environment at Roche which still uses {dplyr} 0.8.5. Unfortunately, the installation failed.
Error: object ‘c_across’ is not exported by 'namespace:dplyr'
Execution halted
ERROR: lazy loading failed for package ‘metacore’
* removing ‘/home/bceuser/neitmant/admiral.roche/renv/staging/1/metacore’
Could you adjust your code such that it works with older dplyr versions?
I'd be happy to edit the matrix config of your R CMD check
workflow such that it runs also on earlier R versions with older packages. I did that for {admiral} and it proved to be very valuable for ensuring that the package works on our kind of old production environment.
Just noticed that:
http://cran.r-project.org/package=metacore
Currently yields:
Package ‘metacore’ was removed from the CRAN repository.
Formerly available versions can be obtained from the [archive](https://cran.r-project.org/src/contrib/Archive/metacore).
Archived on 2023-03-01 as issues were not corrected despite reminders.
A summary of the most recent check results can be obtained from the [check results archive](https://cran-archive.r-project.org/web/checks/2023/2023-03-01_check_results_metacore.html).
Please use the canonical form https://cran.r-project.org/package=metacore to link to this page.
metacore$var_spec %>%
mutate(var1 = str_remove(variable, "[[:alnum:]]+\\.")) %>%
group_by(var1) %>%
mutate(n_lab = n_distinct(label)) %>%
filter(n_lab > 1)
Greetings,
I was looking over things and the hex image seems to be super large when using Edge - not sure if you all see this on your browsers.
@statasaurus Thanks again for helping with building the Roche spec reader. After making some minor adjustments the reader now works properly and I incorporated it within an internal package. However, as the reader function makes use of the un-exported create_tbl()
function I get a Note when running R CMD check
which is not acceptable according to our package validation guidelines. Thus, please export this function. Thank you!
A declarative, efficient, and flexible JavaScript library for building user interfaces.
🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.
TypeScript is a superset of JavaScript that compiles to clean JavaScript output.
An Open Source Machine Learning Framework for Everyone
The Web framework for perfectionists with deadlines.
A PHP framework for web artisans
Bring data to life with SVG, Canvas and HTML. 📊📈🎉
JavaScript (JS) is a lightweight interpreted programming language with first-class functions.
Some thing interesting about web. New door for the world.
A server is a program made to process requests and deliver data to clients.
Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.
Some thing interesting about visualization, use data art
Some thing interesting about game, make everyone happy.
We are working to build community through open source technology. NB: members must have two-factor auth.
Open source projects and samples from Microsoft.
Google ❤️ Open Source for everyone.
Alibaba Open Source for everyone
Data-Driven Documents codes.
China tencent open source team.