
datacomparer's People

Contributors

krishanbhasin, mend-bolt-for-github[bot], rjli13, robne1982, sajohnston, sclewis23, tmbjmu


datacomparer's Issues

saveReport shows only 5 sample rows

Is it possible to increase the number of differing rows that saveReport shows, instead of the current 5-row preview for each variable? Could a user expand the list when knitting the report?

ex:

test <- rCompare(df1, df2 ,keys = 'id' )
saveReport(test, reportName = 'test' , n = 20)

*n – The first n different rows

Test failures on R 4.0.0 pre-release (win-builder check)

The results of using the win-builder for the upcoming R release (4.0.0) had some tests now failing, though they passed on the current version and on 3.5.3.

It sounds like 4.0.0 will be coming out tomorrow, so these should be addressed before releasing or submitting to CRAN.

 -- 1. Failure: Coercion wrapper function (@testCoercion.R#92)  -----------------
  executeCoercions(Fac, WSF, T) not equal to `Ret3`.
  Component "DataTypes": Component "numeric": 1 string mismatch
  Component "DataTypes": Component "character": 2 string mismatches
  
  -- 2. Failure: Coercion wrapper function (@testCoercion.R#93)  -----------------
  executeCoercions(Fac, WSF, F) not equal to `Ret4`.
  Component "DataTypes": Component "numeric": 1 string mismatch
  Component "DataTypes": Component "character": 2 string mismatches
  
  -- 3. Failure: Coercion wrapper function (@testCoercion.R#94)  -----------------
  executeCoercions(WSF, Fac, F) not equal to `Ret5`.
  Component "DataTypes": Component "numeric": 1 string mismatch
  Component "DataTypes": Component "character": 2 string mismatches
  
  -- 4. Failure: Coercion wrapper function (@testCoercion.R#95)  -----------------
  executeCoercions(WS, WSF, T) not equal to `Ret6`.
  Component "DataTypes": Component "character": 1 string mismatch
  
  -- 5. Failure: ComparisonOfEquals (@testEndToEndFourKeys.R#63)  ----------------
  length(ABcomparison$cleaninginfo$COLOR) not equal to 4.
  1/1 mismatches
  [1] 0 - 4 == -4
  
  -- 6. Failure: ComparisonOfUnEquals (@testEndToEndFourKeys.R#107)  -------------
  length(ABcomparison$cleaninginfo$COLOR) not equal to 4.
  1/1 mismatches
  [1] 0 - 4 == -4
  
  -- 7. Failure: ComparisonOfMissRows (@testEndToEndFourKeys.R#147)  -------------
  length(ABcomparison$cleaninginfo$COLOR) not equal to 4.
  1/1 mismatches
  [1] 0 - 4 == -4
  
  -- 8. Failure: ComparisonOfMissCols (@testEndToEndFourKeys.R#188)  -------------
  length(ABcomparison$cleaninginfo$COLOR) not equal to 4.
  1/1 mismatches
  [1] 0 - 4 == -4
  
  -- 9. Failure: ComparisonOfEquals (@testEndToEndTwoKeys.R#59)  -----------------
  length(ABcomparison$cleaninginfo$COLOR) not equal to 4.
  1/1 mismatches
  [1] 0 - 4 == -4
  
  -- 10. Failure: ComparisonOfUnEquals (@testEndToEndTwoKeys.R#99)  --------------
  length(ABcomparison$cleaninginfo$COLOR) not equal to 4.
  1/1 mismatches
  [1] 0 - 4 == -4
  
  -- 11. Failure: ComparisonOfMissRows (@testEndToEndTwoKeys.R#135)  -------------
  length(ABcomparison$cleaninginfo$COLOR) not equal to 4.
  1/1 mismatches
  [1] 0 - 4 == -4
  
  -- 12. Failure: ComparisonOfMissCols (@testEndToEndTwoKeys.R#172)  -------------
  length(ABcomparison$cleaninginfo$COLOR) not equal to 4.
  1/1 mismatches
  [1] 0 - 4 == -4
  
  == testthat results  ===========================================================
  [ OK: 999 | SKIPPED: 3 | WARNINGS: 0 | FAILED: 12 ]
  1. Failure: Coercion wrapper function (@testCoercion.R#92) 
  2. Failure: Coercion wrapper function (@testCoercion.R#93) 
  3. Failure: Coercion wrapper function (@testCoercion.R#94) 
  4. Failure: Coercion wrapper function (@testCoercion.R#95) 
  5. Failure: ComparisonOfEquals (@testEndToEndFourKeys.R#63) 
  6. Failure: ComparisonOfUnEquals (@testEndToEndFourKeys.R#107) 
  7. Failure: ComparisonOfMissRows (@testEndToEndFourKeys.R#147) 
  8. Failure: ComparisonOfMissCols (@testEndToEndFourKeys.R#188) 
  9. Failure: ComparisonOfEquals (@testEndToEndTwoKeys.R#59) 
  1. ...

Majority of functions missing while installing package : dataCompareR

I am using R version 3.3.2.
When I install dataCompareR from CRAN, I get only 3 functions, namely:

  1. rCompare
  2. generateMismatchData
  3. saveReport

I need other functions from the package, so I tried installing using

library(devtools)
install_git('https://github.com/capitalone/dataCompareR.git', branch = 'master',
subdir = 'dataCompareR', type = 'source', repos = NULL)

The package gets installed, but when I load it I get this error:
library(dataCompareR)
Error in fetch(key) :
lazy-load database 'C:/.../Documents/R/win-library/3.3/dataCompareR/help/dataCompareR.rdb' is corrupt

Error when unequal number of rows

I'm looking for a way to find mismatches between two data frames that may have an unequal number of rows. Something along the lines of what you might get from running anti_join(df1, df2) followed by anti_join(df2, df1). I hoped that dataCompareR would do this, but apparently it's not possible.

df2 <- tibble(col1 = c("cat", "dog", "mouse", "fly"))
df1 <- tibble(col1 = c("cat", "dog", "rat"))
dataCompareR::rCompare(df1, df2)
Running rCompare...
Coercing input data to data.frame
Error in (nrow(df_a_subset) + 1):nrow(df_a) : argument of length 0

What do you think about adding this functionality to dataCompareR?
Or maybe I'm missing some other obvious way to do this kind of comparison?
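A minimal base R sketch of the two-way anti-join described above, assuming rows are matched on col1. It uses %in% instead of dplyr's anti_join() so that it runs without extra packages; only_in_df1 and only_in_df2 are illustrative names, not part of dataCompareR.

```r
# Plain data frames standing in for the tibbles in the report.
df1 <- data.frame(col1 = c("cat", "dog", "rat"), stringsAsFactors = FALSE)
df2 <- data.frame(col1 = c("cat", "dog", "mouse", "fly"), stringsAsFactors = FALSE)

# Rows of df1 whose key has no match in df2, and the reverse.
only_in_df1 <- df1[!df1$col1 %in% df2$col1, , drop = FALSE]
only_in_df2 <- df2[!df2$col1 %in% df1$col1, , drop = FALSE]

print(only_in_df1)
print(only_in_df2)
```

This only reports rows that are missing from one side; it does not compare values in non-key columns the way rCompare does.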

Help file build generates warnings

I think we still have some malformed tags in the help files. Not sure why this appears to happen only sometimes, but I have seen it on two systems. Should be a simple fix.

Package generates warnings due to new version of dplyr

## `mutate_each()` is deprecated.
## Use `mutate_all()`, `mutate_at()` or `mutate_if()` instead.
## To map `funs` over all variables, use `mutate_all()`
## `mutate_each()` is deprecated.
## Use `mutate_all()`, `mutate_at()` or `mutate_if()` instead.
## To map `funs` over all variables, use `mutate_all()`
## `mutate_each()` is deprecated.
## Use `mutate_all()`, `mutate_at()` or `mutate_if()` instead.
## To map `funs` over all variables, use `mutate_all()`
## `mutate_each()` is deprecated.
## Use `mutate_all()`, `mutate_at()` or `mutate_if()` instead.
## To map `funs` over all variables, use `mutate_all()`

CI failing

The CI is failing to start with:

sh -e /etc/init.d/xvfb start

sh: 0: Can't open /etc/init.d/xvfb

The command "sh -e /etc/init.d/xvfb start" failed and exited with 127 during .

I set up xvfb start when we first set the CI up years ago to get the tests running in "headless" mode based on how Travis was configured to run R back then.

I'm guessing things have moved on since then, I'll take a look at how to get them running properly again later.

CVE-2020-11023 (Medium) detected in jquery-3.4.1.min.js

CVE-2020-11023 - Medium Severity Vulnerability

Vulnerable Library - jquery-3.4.1.min.js

JavaScript library for DOM operations

Library home page: https://cdnjs.cloudflare.com/ajax/libs/jquery/3.4.1/jquery.min.js

Path to dependency file: dataCompareR/reference/createCleaningInfo.html

Path to vulnerable library: dataCompareR/reference/createCleaningInfo.html

Dependency Hierarchy:

  • jquery-3.4.1.min.js (Vulnerable Library)

Found in HEAD commit: 567a64e178266fdcb9b927190a300696c2430033

Vulnerability Details

In jQuery versions greater than or equal to 1.0.3 and before 3.5.0, passing HTML containing elements from untrusted sources - even after sanitizing it - to one of jQuery's DOM manipulation methods (i.e. .html(), .append(), and others) may execute untrusted code. This problem is patched in jQuery 3.5.0.

Publish Date: 2020-04-29

URL: CVE-2020-11023

CVSS 3 Score Details (6.1)

Base Score Metrics:

  • Exploitability Metrics:
    • Attack Vector: Network
    • Attack Complexity: Low
    • Privileges Required: None
    • User Interaction: Required
    • Scope: Changed
  • Impact Metrics:
    • Confidentiality Impact: Low
    • Integrity Impact: Low
    • Availability Impact: None


Suggested Fix

Type: Upgrade version

Origin: https://cve.mitre.org/cgi-bin/cvename.cgi?name=CVE-2020-11023

Release Date: 2020-04-29

Fix Resolution: jquery - 3.5.0



CVE-2018-14040 (Medium) detected in bootstrap-3.3.5.min.js, bootstrap-3.3.5.js

CVE-2018-14040 - Medium Severity Vulnerability

Vulnerable Libraries - bootstrap-3.3.5.min.js, bootstrap-3.3.5.js

bootstrap-3.3.5.min.js

The most popular front-end framework for developing responsive, mobile first projects on the web.

Library home page: https://cdnjs.cloudflare.com/ajax/libs/twitter-bootstrap/3.3.5/js/bootstrap.min.js

Path to vulnerable library: /packrat/lib/x86_64-pc-linux-gnu/3.4.4/rmarkdown/rmd/h/bootstrap/js/bootstrap.min.js

Dependency Hierarchy:

  • bootstrap-3.3.5.min.js (Vulnerable Library)

bootstrap-3.3.5.js

The most popular front-end framework for developing responsive, mobile first projects on the web.

Library home page: https://cdnjs.cloudflare.com/ajax/libs/twitter-bootstrap/3.3.5/js/bootstrap.js

Path to vulnerable library: /packrat/lib/x86_64-pc-linux-gnu/3.4.4/rmarkdown/rmd/h/bootstrap/js/bootstrap.js

Dependency Hierarchy:

  • bootstrap-3.3.5.js (Vulnerable Library)

Found in HEAD commit: 567a64e178266fdcb9b927190a300696c2430033

Vulnerability Details

In Bootstrap before 4.1.2, XSS is possible in the collapse data-parent attribute.

Publish Date: 2018-07-13

URL: CVE-2018-14040

CVSS 3 Score Details (6.1)

Base Score Metrics:

  • Exploitability Metrics:
    • Attack Vector: Network
    • Attack Complexity: Low
    • Privileges Required: None
    • User Interaction: Required
    • Scope: Changed
  • Impact Metrics:
    • Confidentiality Impact: Low
    • Integrity Impact: Low
    • Availability Impact: None


Suggested Fix

Type: Upgrade version

Origin: twbs/bootstrap#26630

Release Date: 2018-07-13

Fix Resolution: org.webjars.npm:bootstrap:4.1.2,org.webjars:bootstrap:3.4.0



Use identical() for any equality checks

(Issue ported from another server - other people were involved in the initial conversation!)

In hindsight perhaps it would have been a good idea to read through the R documentation first...

The way object comparison is implemented in R is a bit odd, leading to "special" moments such as this:

> "1" == 1
[1] TRUE

This behaviour makes a bit more sense having looked through the relevant documentation, in particular the following quote from the comparison page:

Do not use == and != for tests, such as in if expressions, where you must get a single TRUE or FALSE. Unless you are absolutely sure that nothing unusual can happen, you should use the identical function instead.

Consider also the following description of the identical() function:

The safe and reliable way to test two objects for being exactly equal. It returns TRUE in this case, FALSE in every other case.

... then again, the project is called rcompare, not ridentical ;)
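A short illustration of the difference the quoted documentation describes, using only base R. The variable names are illustrative.

```r
# `==` coerces both sides to a common type before comparing,
# so the number 1 is converted to the string "1".
loose <- "1" == 1

# identical() compares type as well as value and always returns one logical.
strict <- identical("1", 1)
int_vs_double <- identical(1L, 1)   # integer versus double

# `==` is vectorised, so it can hand an `if` more than one value.
elementwise <- c(1, 2) == c(1, 3)

# For doubles, all.equal() allows a small tolerance; wrap it in isTRUE().
near_equal <- isTRUE(all.equal(0.1 + 0.2, 0.3))
exact_equal <- identical(0.1 + 0.2, 0.3)

print(c(loose = loose, strict = strict, int_vs_double = int_vs_double,
        near_equal = near_equal, exact_equal = exact_equal))
print(elementwise)
```

Note that identical() is strict about storage type and floating point representation, which may be too strict for comparing data read from different sources.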

Improve unit test hygiene

When running the unit tests:

  • the console is spammed with dataframes and other output
  • the environment gains numerous datasets that weren't there before
  • there's a warning encoding is deprecated; all files now assumed to be UTF-8

All of these can be remedied!

Inconsistent capitalization

Running

rm(list=ls())

library(dataCompareR)


#dataCompare will match data frames (or any objects that can be coerced to data frames) - this is part of the package (uses as.data.frame)

#Lets use iris in the first example
head(iris)

#Create a new data frame to use in dataCompareR

#Make a copy of iris
iris2 <- iris

#Change it by first subsetting just to the first 140 rows:
iris2 <- iris[1:140,]

#then remove Petal.Width column
iris2$Petal.Width <- NULL

#and then change some values
iris2[1:10,1] <- iris2[1:10,1] + 1

#Comparison without a key:
#Rows are matched based on order: if the dataframes have different number of rows then rows will be dropped from the larger data frame
#This will be recorded in the output

#Run the comparison
compIris <- rCompare(iris,iris2)


summary(compIris)

Results in

Columns only in iris: Petal.Width  
Columns in both : PETAL.LENGTH, SEPAL.LENGTH, SEPAL.WIDTH, SPECIES 

We're kinda stuck with capitals as we have no easy way of recovering the original case, but the first row should be capitals too!

Summary fails if tables are pulled out of named lists.

I found a bit of an edge case. Summary fails if the original data frames are passed from a named list. Maybe the table names need to be sanitized before returning the comp object? BTW, I have removed the dplyr deprecated function notices from the reprex output for clarity.

library(tibble)
library(dataCompareR)

table1 <- tribble(~A, ~B, ~C,
                   1,  2,  3,
                   2,  6,  7)

table2 <- tribble(~A, ~D, ~C,
                   1,  2, 19,
                   2,  6,  7)

lis <- list(table1 = table1, table2 = table2)

comp1 <- rCompare(table1, table2, keys = "A")
#> Running rCompare...
#> Coercing input data to data.frame

summary(comp1)
#> dataCompareR is generating the summary...
#> 
#> Data Comparison
#> ===============
#> 
#> Date comparison run: 2020-11-13 13:00:13  
#> Comparison run on R version 4.0.3 (2020-10-10)  
#> With dataCompareR version 0.1.3  
#> 
#> 
#> Meta Summary
#> ============
#> 
#> 
#> |Dataset Name |Number of Rows |Number of Columns |
#> |:------------|:--------------|:-----------------|
#> |table1       |2              |3                 |
#> |table2       |2              |3                 |
#> 
#> 
#> Variable Summary
#> ================
#> 
#> Number of columns in common: 2  
#> Number of columns only in table1: 1  
#> Number of columns only in table2: 1  
#> Number of columns with a type mismatch: 0  
#> Match keys : 1   - A
#> 
#> 
#> Columns only in table1: B  
#> Columns only in table2: D  
#> Columns in both : A, C  
#> 
#> Row Summary
#> ===========
#> 
#> Total number of rows read from table1: 2  
#> Total number of rows read from table2: 2    
#> Number of rows in common: 2  
#> Number of rows dropped from table1: 0  
#> Number of rows dropped from  table2: 0  
#> 
#> 
#> Data Values Comparison Summary
#> ==============================
#> 
#> Number of columns compared with ALL rows equal: 0  
#> Number of columns compared with SOME rows unequal: 1  
#> Number of columns with missing value differences: 0  
#> 
#> 
#> 
#> Summary of columns with some rows unequal: 
#> 
#> 
#> 
#> |Column |Type (in table1) |Type (in table2) | # differences|Max difference | # NAs|
#> |:------|:----------------|:----------------|-------------:|:--------------|-----:|
#> |C      |double           |double           |             1|16             |     0|
#> 
#> 
#> 
#> Unequal column details
#> ======================
#> 
#> 
#> 
#> #### Column -  C
#> 
#> 
#> 
#> |  A| C (table1)| C (table2)|Type (table1) |Type (table2) | Difference|
#> |--:|----------:|----------:|:-------------|:-------------|----------:|
#> |  1|          3|         19|double        |double        |        -16|

comp2 <- rCompare(lis$table1, lis$table2, keys = "A")
#> Running rCompare...
#> Coercing input data to data.frame

summary(comp2)
#> dataCompareR is generating the summary...
#> Warning in matrix(c(object$meta$A$name, object$meta$A$rows,
#> object$meta$A$cols, : data length [10] is not a sub-multiple or multiple of the
#> number of columns [3]
#> Error in names(x) <- value: 'names' attribute [7] must be the same length as the vector [3]

Created on 2020-11-13 by the reprex package (v0.3.0)
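One possible explanation, offered as an assumption and not checked against the package source: if the dataset name is captured with as.character(substitute(x)), an argument such as lis$table1 yields three strings instead of one. Two datasets with a three-part name plus a row count and a column count each would give 2 x (3 + 1 + 1) = 10 values, which matches the "data length [10]" warning. The helper names below are hypothetical.

```r
# Hypothetical helpers: two ways a function might record its argument's name.
name_via_as_character <- function(x) as.character(substitute(x))
name_via_deparse <- function(x) paste(deparse(substitute(x)), collapse = "")

table1 <- data.frame(A = 1:2)
lis <- list(table1 = table1)

plain_name <- name_via_as_character(table1)      # a single string
split_name <- name_via_as_character(lis$table1)  # the call split into pieces
whole_name <- name_via_deparse(lis$table1)       # a single string again

print(plain_name)
print(split_name)
print(whole_name)
```

If that is the cause, assigning the list elements to plain variables before calling rCompare should work around the error until the names are sanitized.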

CVE-2018-20677 (Medium) detected in bootstrap-3.3.5.min.js, bootstrap-3.3.5.js

CVE-2018-20677 - Medium Severity Vulnerability

Vulnerable Libraries - bootstrap-3.3.5.min.js, bootstrap-3.3.5.js

bootstrap-3.3.5.min.js

The most popular front-end framework for developing responsive, mobile first projects on the web.

Library home page: https://cdnjs.cloudflare.com/ajax/libs/twitter-bootstrap/3.3.5/js/bootstrap.min.js

Path to vulnerable library: /packrat/lib/x86_64-pc-linux-gnu/3.4.4/rmarkdown/rmd/h/bootstrap/js/bootstrap.min.js

Dependency Hierarchy:

  • bootstrap-3.3.5.min.js (Vulnerable Library)

bootstrap-3.3.5.js

The most popular front-end framework for developing responsive, mobile first projects on the web.

Library home page: https://cdnjs.cloudflare.com/ajax/libs/twitter-bootstrap/3.3.5/js/bootstrap.js

Path to vulnerable library: /packrat/lib/x86_64-pc-linux-gnu/3.4.4/rmarkdown/rmd/h/bootstrap/js/bootstrap.js

Dependency Hierarchy:

  • bootstrap-3.3.5.js (Vulnerable Library)

Found in HEAD commit: 567a64e178266fdcb9b927190a300696c2430033

Vulnerability Details

In Bootstrap before 3.4.0, XSS is possible in the affix configuration target property.

Publish Date: 2019-01-09

URL: CVE-2018-20677

CVSS 3 Score Details (6.1)

Base Score Metrics:

  • Exploitability Metrics:
    • Attack Vector: Network
    • Attack Complexity: Low
    • Privileges Required: None
    • User Interaction: Required
    • Scope: Changed
  • Impact Metrics:
    • Confidentiality Impact: Low
    • Integrity Impact: Low
    • Availability Impact: None


Suggested Fix

Type: Upgrade version

Origin: https://cve.mitre.org/cgi-bin/cvename.cgi?name=CVE-2018-20677

Release Date: 2019-01-09

Fix Resolution: Bootstrap - v3.4.0;NorDroN.AngularTemplate - 0.1.6;Dynamic.NET.Express.ProjectTemplates - 0.8.0;dotnetng.template - 1.0.0.4;ZNxtApp.Core.Module.Theme - 1.0.9-Beta;JMeter - 5.0.0



Error on empty data.frames

Hey,

thanks for the package, it is very useful and very handy - we love the summary and the reporting!

What irritates me is the following:

I have two data.frames, e.g.:

library(dataCompareR)

df_1 <- data.frame(a = character(0), b = integer(0))
df_2 <- data.frame(a = character(0), b = integer(0))

rCompare(df_1, df_2)
## Running rCompare...
## Error in checkEmpty(df1)  : ERROR : One or more dataframes are empty

Obviously this is not a bug but intended behaviour (right?) BUT I would argue that

  1. both data.frames are valid
  2. they are equal (same columns, same data). Why impose on the user that data is only valid if it's filled?

I would suggest either redesigning the function so that it handles 0-row data.frames just like any other data.frame, or allowing the user to prevent this error by setting a parameter (e.g.: rCompare(df_1, df_2, do_not_error_on_empty_df = TRUE)).

What do you think?
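Until something like that exists, a caller could guard the call themselves. The sketch below is a hypothetical pre-check, not part of dataCompareR; it treats two 0-row data.frames as equal when their column names and column classes agree.

```r
# Hypothetical pre-check, not part of dataCompareR.
empty_frames_match <- function(df_a, df_b) {
  # Named vector of the first class of each column.
  col_class <- function(df) vapply(df, function(col) class(col)[1], character(1))
  nrow(df_a) == 0 && nrow(df_b) == 0 &&
    identical(col_class(df_a), col_class(df_b))
}

df_1 <- data.frame(a = character(0), b = integer(0))
df_2 <- data.frame(a = character(0), b = integer(0))
df_3 <- data.frame(a = character(0), c = integer(0))

same <- empty_frames_match(df_1, df_2)    # same columns, both empty
differ <- empty_frames_match(df_1, df_3)  # column names differ

print(c(same = same, differ = differ))
```

A caller would run rCompare only when this check returns FALSE and at least one frame has rows.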

Codeowners file

Please create a codeowners file and add trusted reviewers to it and to the project write team if necessary. Thanks!

Unhelpful error message if there are no columns to compare

Not particularly important, but when I was working on #71 I tried to run the following

df2 <- tibble(col1 = c("cat", "dog", "mouse", "fly"))
df1 <- tibble(col1 = c("cat", "dog", "rat"))

rCompare(df1, df2, keys = "col1")

Clearly this isn't smart, as without col1, there's nothing left to compare.

However, the output isn't clear

Running rCompare...
Coercing input data to data.frame
 Error in if (nrow(DFA) == 0) { : argument is of length zero 

It would be better to catch this and output a friendly error message.
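A sketch of what such a check could look like, assuming it runs before the comparison starts. check_columns_to_compare is a hypothetical helper and the message wording is illustrative.

```r
# Hypothetical guard, not part of dataCompareR.
check_columns_to_compare <- function(df_a, df_b, keys = character(0)) {
  shared <- intersect(names(df_a), names(df_b))
  to_compare <- setdiff(shared, keys)
  if (length(to_compare) == 0) {
    stop("No columns left to compare once the key columns are excluded: ",
         paste(keys, collapse = ", "), call. = FALSE)
  }
  to_compare
}

df1 <- data.frame(col1 = c("cat", "dog", "rat"))
df2 <- data.frame(col1 = c("cat", "dog", "mouse", "fly"))

# Capture the error message instead of letting it abort the script.
msg <- tryCatch(
  check_columns_to_compare(df1, df2, keys = "col1"),
  error = function(e) conditionMessage(e)
)
print(msg)
```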

Some missing links in help files

Rd warning: /tmp/Rtmpdyoe8U/file782c5f5cf1b1/dataCompareR/man/validateArguments.Rd:14: missing file link ‘round’

Rd warning: /tmp/Rtmpdyoe8U/file782c5f5cf1b1/dataCompareR/man/processFlow.Rd:15: missing file link ‘round’
    rCompare                                html  
Rd warning: /tmp/Rtmpdyoe8U/file782c5f5cf1b1/dataCompareR/man/rCompare.Rd:19: missing file link ‘round’

Should be fixed before CRAN submission.

Update test coverage check in next release

This step was overlooked for the current release, but isn't essential in terms of checking that things work. It'll be good to have an up-to-date one for the next release and to do it as part of the release process.

v0.1.2 CRAN release

CRAN is on summer holiday until next week!

Therefore, to avoid repeating steps again, I will document the pre-upload checks here, and attach the source so we can easily upload once the submission page is back.

I have

  • cloned from master
  • got the latest versions of R and packages (on Windows)
  • Ran
    • devtools::test()
    • devtools::document()
    • devtools::check()
    • devtools::build()
  • submitted to https://win-builder.r-project.org/
  • uploaded source here
    dataCompareR_0.1.2.tar.gz
  • installed the package from source, with no warnings
  • checked and we have no reverse depends

Evidence below.

To do list:

  • Get winbuilder results from the OSO mailbox
  • Submit to CRAN

NULL at end of print sometimes

All columns were compared, all rows were compared 
All compared variables match 
 Number of rows compared: 243 
 Number of columns compared: 102NULL

Not sure how widespread this is. No key used in the example. Need to create a reproducible version of this error.

No longer on CRAN

The package has been removed from CRAN:

Archived on 2020-04-09 as check problems were not corrected despite reminders.

In addition to putting this back onto CRAN, I will also look into why I received no reminders about the check problems or any communication about this happening.

Make non-unique keys message more informative

If I run

rCompare(ddf, ddf2, keys = c("a", "b", "c", "d","e"))

but there are duplicates in c("a", "b", "c", "d","e") I get an error message like:

Running rCompare...
Error in matchSingleIndex(df_a, df_b, "dataCompareR_merged_indices", indices) : 
  The indices are not unique in the submitted dataframes. Please resubmit with unique indices.

I expect my indices are unique, so now I have to write some code to figure out why they're not!

I'm guessing the code knows more about what element is not unique, so it'd be nice if it told me!
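In the meantime, the offending rows can be found with base R's duplicated(). A minimal sketch, with find_duplicate_keys as a hypothetical helper and a small made-up data frame standing in for ddf:

```r
# Hypothetical helper: return every row whose key combination occurs more than once.
find_duplicate_keys <- function(df, keys) {
  key_cols <- df[, keys, drop = FALSE]
  # Flag both the first occurrence and the later repeats.
  dup <- duplicated(key_cols) | duplicated(key_cols, fromLast = TRUE)
  df[dup, , drop = FALSE]
}

# Made-up data: the first two rows share the key (a = 1, b = "x").
ddf <- data.frame(a = c(1, 1, 2), b = c("x", "x", "y"), value = c(10, 20, 30))
dups <- find_duplicate_keys(ddf, c("a", "b"))
print(dups)
```

An error message that listed the first few such key combinations would save users from writing this themselves.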

dplyr warnings about deprecated code

These warnings don't show up every time ('once per session'). Running the tests with R 4.0.0 and dplyr 0.8.5 produced the warnings below. Not all the tests have been run yet, so I will add more as they come up.

testCheckPrintObject.R:38: warning: print only generates message when data sets match
select_() is deprecated. 
Please use select() instead

The 'programming' vignette or the tidyeval book can help you
to program with select() : https://tidyeval.tidyverse.org
This warning is displayed once per session.

testCheckPrintObject.R:62: warning: print returns message and data when mismatches occur
arrange_() is deprecated. 
Please use arrange() instead

The 'programming' vignette or the tidyeval book can help you
to program with arrange() : https://tidyeval.tidyverse.org
This warning is displayed once per session.

testCheckPrintObject.R:62: warning: print returns message and data when mismatches occur
funs() is soft deprecated as of dplyr 0.8.0
Please use a list of either functions or lambdas: 

  # Simple named list: 
  list(mean = mean, median = median)

  # Auto named with `tibble::lst()`: 
  tibble::lst(mean, median)

  # Using lambdas
  list(~ mean(., trim = .2), ~ median(., na.rm = TRUE))
This warning is displayed once per session.

CVE-2020-11022 (Medium) detected in jquery-3.4.1.min.js

CVE-2020-11022 - Medium Severity Vulnerability

Vulnerable Library - jquery-3.4.1.min.js

JavaScript library for DOM operations

Library home page: https://cdnjs.cloudflare.com/ajax/libs/jquery/3.4.1/jquery.min.js

Path to dependency file: dataCompareR/reference/createCleaningInfo.html

Path to vulnerable library: dataCompareR/reference/createCleaningInfo.html

Dependency Hierarchy:

  • jquery-3.4.1.min.js (Vulnerable Library)

Found in HEAD commit: 567a64e178266fdcb9b927190a300696c2430033

Vulnerability Details

In jQuery versions greater than or equal to 1.2 and before 3.5.0, passing HTML from untrusted sources - even after sanitizing it - to one of jQuery's DOM manipulation methods (i.e. .html(), .append(), and others) may execute untrusted code. This problem is patched in jQuery 3.5.0.

Publish Date: 2020-04-29

URL: CVE-2020-11022

CVSS 3 Score Details (6.1)

Base Score Metrics:

  • Exploitability Metrics:
    • Attack Vector: Network
    • Attack Complexity: Low
    • Privileges Required: None
    • User Interaction: Required
    • Scope: Changed
  • Impact Metrics:
    • Confidentiality Impact: Low
    • Integrity Impact: Low
    • Availability Impact: None


Suggested Fix

Type: Upgrade version

Origin: https://blog.jquery.com/2020/04/10/jquery-3-5-0-released/

Release Date: 2020-04-29

Fix Resolution: jQuery - 3.5.0



Error when Invalid column name included in the keys

If an invalid column name is included in the keys, the column names in the data frames are fixed to make them valid, and the subsequent check that the keys exist then fails because the keys no longer match the fixed names.

A user might not be able to tell what the issue is, as in their version of the data frames, the keys are present, and it's not obvious what the changed column names are after a fix.
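To illustrate the kind of renaming involved, the sketch below uses base R's make.names(). Whether dataCompareR's own clean-up behaves the same way is an assumption; the key names are made up.

```r
# Made-up keys: two are not syntactically valid R names.
keys <- c("customer id", "1st_date", "amount")

# make.names() replaces invalid characters with "." and prefixes "X"
# to names that start with a digit.
fixed <- make.names(keys)

# Report which keys were altered, so the user can see the mapping.
changed <- keys != fixed
for (i in which(changed)) {
  message("Key '", keys[i], "' was renamed to '", fixed[i], "'")
}
```

Printing a mapping like this, or applying the same fix to the keys as to the column names, would make the failure easier to diagnose.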

rd warnings on install on some platforms

They're back!

createCompareObject                     html  
    createMeta                              html  
Rd warning: /tmp/RtmpyWaVpH/R.INSTALL63e85f08fea5/dataCompareR/man/createMeta.Rd:22: missing file link ‘round’
    createMismatchObject                    html  
    createMismatches                        html  
    createReportText                        html  
    createRowMatching                       html  
    createTextSummary                       html  
    currentObjVersion                       html  
    executeCoercions                        html  
    generateMismatchData                    html  
    getCoercions                            html  
    getMismatchColNames                     html  
    is.dataCompareRobject                   html  
    isNotNull                               html  
    isSingleNA                              html  
    listObsNotVerbose                       html  
    listObsVerbose                          html  
    locateMismatches                        html  
    makeValidKeys                           html  
    makeValidNames                          html  
    matchColumns                            html  
    matchMultiIndex                         html  
    matchNoIndex                            html  
    matchRows                               html  
    matchSingleIndex                        html  
    metaDataInfo                            html  
    mismatchHighStop                        html  
    orderColumns                            html  
    outputSectionHeader                     html  
    prepareData                             html  
    print.dataCompareRobject                html  
    print.summary.dataCompareRobject        html  
    processFlow                             html  
Rd warning: /tmp/RtmpyWaVpH/R.INSTALL63e85f08fea5/dataCompareR/man/processFlow.Rd:15: missing file link ‘round’
    rCompare                                html  
Rd warning: /tmp/RtmpyWaVpH/R.INSTALL63e85f08fea5/dataCompareR/man/rCompare.Rd:19: missing file link ‘round’
    rcompObjItemLength                      html  
    rounddf                                 html  
    saveReport                              html  
    subsetDataColumns                       html  
    summary.dataCompareRobject              html  
    trimCharVars                            html  
    updateCompareObject                     html  
    updateCompareObject.cleaninginfo        html  
    updateCompareObject.colmatching         html  
    updateCompareObject.matches             html  
    updateCompareObject.meta                html  
    updateCompareObject.mismatches          html  
    updateCompareObject.rowmatching         html  
    validateArguments                       html  
Rd warning: /tmp/RtmpyWaVpH/R.INSTALL63e85f08fea5/dataCompareR/man/validateArguments.Rd:14: missing file link ‘round’
    validateData                            html  
    variableDetails                         html  
    variableMismatches                      html  
    warnLargeData                           html  

Error with large datasets

The line

totalSize <- nrow(coercedData[[1]])*ncol(coercedData[[1]]) + nrow(coercedData[[2]])*ncol(coercedData[[2]])

aims to calculate the total number of elements for comparison. However, if this results in a value outside the range of a 32-bit integer, the code errors with

Running rCompare...
Coercing input data to data.frame
Error in if (totalSize > 2e+07) { : missing value where TRUE/FALSE needed
In addition: Warning message:
In nrow(coercedData[[1]]) * ncol(coercedData[[1]]) + nrow(coercedData[[2]]) *  :
  NAs produced by integer overflow

This code is just there to warn the user about a possible long run time, so it'd be preferable to remove this rather than cause this error, although I imagine there are probably numerous ways to fix it.
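One of those ways is to do the arithmetic in doubles. nrow() and ncol() return integers, and integer arithmetic in R yields NA with a warning once a result exceeds .Machine$integer.max (2147483647). A minimal sketch, with total_cells as a hypothetical helper name:

```r
# Integer multiplication overflows to NA (with a warning).
n <- 50000L
overflowed <- suppressWarnings(n * n)

# Converting one operand to double first avoids the overflow.
safe <- as.numeric(n) * n

# Hypothetical replacement for the totalSize calculation.
total_cells <- function(df_a, df_b) {
  as.numeric(nrow(df_a)) * ncol(df_a) + as.numeric(nrow(df_b)) * ncol(df_b)
}

print(overflowed)
print(safe)
print(total_cells(iris, iris))
```

With a double result, the existing totalSize > 2e+07 test receives a real number and no longer sees NA.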

(Technical Debt) Dependencies on references to internal column/list item names

(Ported from another server)

This is mostly just a small technical debt issue.

To give an example of the dependencies I mean, the createColMatching function has hardcoded references to the names of the columns in the matchColumns output that have the name of the columns and the flag for whether it's in A or B. This means we'd have to change it in multiple places if we changed it in the actual matchColumns location, and this is not ideal, nor obvious. It would be better if we could find a way to reduce those kinds of dependencies somehow (passing references or keeping that info in some type of central location, etc.)

print() produces errors with one specific data set

Which I've attached for ease of recreation

ID,Col1,Col2,Col3
1,A,apple,0.8414710
2,B,orange,0.9092974
3,C,apple,0.1411200
4,D,pineapple,-0.7568025
5,E,apple,-0.9589243
6,F,orange,-0.2794155

ID,Col1,Col2,Col4,Col5
1,A,Apple,0.6666666,1
2,b,orange,0.9092974,2
3.0,D,apple,0.14,3
4,D,     pineapple,-0.7568025,4
5,E,apple,0.9589243,5
7,A,pink,4.1213000,6

> rCompare(a, b)
Running rCompare...
3 column(s) were dropped, all rows were compared 
There are  3 mismatched variables:
First and last 5 observations for the  3 mismatched variables
   rowNo    valueA         valueB variable   typeA  typeB diffAB
1      2         B              b     COL1  factor factor       
2      3         C              D     COL1  factor factor       
3      6         F              A     COL1  factor factor       
4      1     apple          Apple     COL2  factor factor       
5      4 pineapple      pineapple     COL2  factor factor       
6      6    orange           pink     COL2  factor factor       
7      1      <NA>           <NA>       ID integer double       
8      2      <NA>           <NA>       ID integer double       
9      3      <NA>           <NA>       ID integer double       
10     4      <NA>           <NA>       ID integer double       
11     5      <NA>           <NA>       ID integer double       
12     6      <NA>           <NA>       ID integer double       
Warning messages:
1: In `[<-.factor`(`*tmp*`, ri, value = 1:6) :
  invalid factor level, NA generated
2: In `[<-.factor`(`*tmp*`, ri, value = c(1, 2, 3, 4, 5, 7)) :
  invalid factor level, NA generated
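
The warnings above match what base R emits when a value that is not an existing level is assigned into a factor. The sketch below reproduces that message without dataCompareR. It is only a guess at the mechanism (integer ID values being written into a factor column), not a confirmed diagnosis of the package code.

```r
# A factor column, similar to valueA/valueB in the output above
valueA <- factor(c("B", "C", "F"))

# Collect warning messages instead of printing them
msgs <- character(0)
withCallingHandlers(
  {
    # 1 and 2 are not levels of the factor, so NA is stored
    valueA[1:2] <- 1:2
  },
  warning = function(w) {
    msgs <<- c(msgs, conditionMessage(w))
    invokeRestart("muffleWarning")
  }
)
print(msgs)
print(valueA)

# Converting to character first avoids the problem in this toy case
valueChar <- as.character(factor(c("B", "C", "F")))
valueChar[1:2] <- as.character(1:2)
print(valueChar)
```

If this is indeed the cause, converting the value columns to character before combining the mismatches would be one way to keep the ID values visible.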

CVE-2019-8331 (Medium) detected in bootstrap-3.3.5.min.js, bootstrap-3.3.5.js

CVE-2019-8331 - Medium Severity Vulnerability

Vulnerable Libraries - bootstrap-3.3.5.min.js, bootstrap-3.3.5.js

bootstrap-3.3.5.min.js

The most popular front-end framework for developing responsive, mobile first projects on the web.

Library home page: https://cdnjs.cloudflare.com/ajax/libs/twitter-bootstrap/3.3.5/js/bootstrap.min.js

Path to vulnerable library: /packrat/lib/x86_64-pc-linux-gnu/3.4.4/rmarkdown/rmd/h/bootstrap/js/bootstrap.min.js

Dependency Hierarchy:

  • bootstrap-3.3.5.min.js (Vulnerable Library)

bootstrap-3.3.5.js

The most popular front-end framework for developing responsive, mobile first projects on the web.

Library home page: https://cdnjs.cloudflare.com/ajax/libs/twitter-bootstrap/3.3.5/js/bootstrap.js

Path to vulnerable library: /packrat/lib/x86_64-pc-linux-gnu/3.4.4/rmarkdown/rmd/h/bootstrap/js/bootstrap.js

Dependency Hierarchy:

  • bootstrap-3.3.5.js (Vulnerable Library)

Found in HEAD commit: 567a64e178266fdcb9b927190a300696c2430033

Vulnerability Details

In Bootstrap before 3.4.1 and 4.3.x before 4.3.1, XSS is possible in the tooltip or popover data-template attribute.

Publish Date: 2019-02-20

URL: CVE-2019-8331

CVSS 3 Score Details (6.1)

Base Score Metrics:

  • Exploitability Metrics:
    • Attack Vector: Network
    • Attack Complexity: Low
    • Privileges Required: None
    • User Interaction: Required
    • Scope: Changed
  • Impact Metrics:
    • Confidentiality Impact: Low
    • Integrity Impact: Low
    • Availability Impact: None


Suggested Fix

Type: Upgrade version

Origin: twbs/bootstrap#28236

Release Date: 2019-02-20

Fix Resolution: bootstrap - 3.4.1,4.3.1;bootstrap-sass - 3.4.1,4.3.1



Round has unexpected behaviour

A user is reporting odd behaviour when using the roundDigits functionality. Specifically, for some columns the number of mismatches moves in the unexpected direction as roundDigits changes: it is lower at 8 digits than at 7. See below for an example.

Column     8 digits   7 digits   Difference   Result
Column_1    2855164     286209      2568955   Expected
Column_2    1338229    1541336      -203107   Not Expected
Column_3    1716294    1302222       414072   Expected
Column_4    1127592    1730836      -603244   Not Expected
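
One thing worth ruling out is that rounding is not strictly monotonic: two values can differ at a coarser rounding yet agree at a finer one when they sit on either side of a rounding boundary. The sketch below shows this with base R's `round()` only. The values are invented for illustration, and whether this explains the user's counts is an assumption, not something established here.

```r
# Two values on either side of the 0.15 boundary
a <- 0.149
b <- 0.151

# Rounded to 1 digit they land in different buckets (0.1 vs 0.2)
mismatch1 <- round(a, 1) != round(b, 1)

# Rounded to 2 digits they both become 0.15
mismatch2 <- round(a, 2) != round(b, 2)

print(c(oneDigit = mismatch1, twoDigits = mismatch2))
```

Here the pair counts as a mismatch at 1 digit but not at 2, so a higher roundDigits does not guarantee at least as many mismatches.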

Installation Instructions

Tested the Installation Instructions on Linux. Package successfully installed in Ubuntu 16.04.2 with R 3.4.0 and the latest CRAN versions of dependent packages.

One minor issue is the installation instructions. Currently they are

library(devtools)
install_git('https://github.com/capitalone/dataCompareR.git', branch = 'master',
            subdir = 'dataCompareR', type = 'source', repos = NULL)

Unfortunately, because the devtools install functions default to build_vignettes = FALSE, the option to build the vignette has to be passed explicitly during installation. Hence they should be

library(devtools)
install_git('https://github.com/capitalone/dataCompareR.git', branch = 'master',
            subdir = 'dataCompareR', type = 'source', repos = NULL,
            build_vignettes = TRUE)

Otherwise vignette('dataCompareR') fails after installation.
