Code Monkey home page Code Monkey logo

poppr's Issues

Fix definition of Hexp

Currently, Hexp, produced by the function poppr is cited as Nei's expected heterozygosity (Nei 1978). Equation 2 in this paper translates to the implementation here if you consider x to be a genotype instead of an allele. How this came to be is a mystery. This is actually an unbiased measure of genotypic diversity representing the probability of drawing two genotypes from a population and having them be different. It's implementation in the function locus_table is accurate.

Tasks:

  • Change citation
  • Add average expected heterozygosity (equation 3 in Nei 1978) to poppr output
  • Fix equation in "algo" reference manual.

poppr breaking with new adegenet on CRAN

CRAN's maintainer has reported the following issue caused by the new version of adegenet on CRAN:

** preparing package for lazy loading
Error in setIs(Class, class2, classDef = classDef, where = where) :
  class ā€œgencloneā€ cannot extend class ā€œgenindā€: slot in class ā€œgencloneā€ must extend corresponding slot in class ā€œgenindā€: fails for hierarchy
Error in setClass("genclone", contains = "genind", representation = representation(mlg = "numeric",  :
  error in contained classes ("genind") for class ā€œgencloneā€; class definition removed from ā€˜popprā€™
Error : unable to load R code in package ā€˜popprā€™
ERROR: lazy loading failed for package ā€˜popprā€™

use stored function/object name as a memory for mlg.filter

Currently, mlg.filter has a default behavior of using nei.dist to calculate the distance matrix. If the user utilizes mlg.filter as an assignment method, currently @distname slot in the object carries the name of the object or the function and the @distargs slot contains the arguments for that function.

In this way there are a couple of procedures that could happen for this.

  1. check if @distname is a function or a distance matrix:
is.function(eval(x@mlg@distname))
is.dist(eval(x@mlg@distname))
  1. if @distname is a function, use do.call
do.call(eval(x@mlg@distname), c(x, x@mlg@distargs))

Add support for custom MLG definitions

Create a slot that allows users to customize MLG definitions such that they would be able to increase or decrease MLG diversity based on their knowledge of the system.

Include tests to be skipped on CRAN

The package testthat has a function called skip_on_cran() that will skip tests on CRAN to save run time. This will allow the construction of a bare set of tests to make sure the package works and an expanded set to increase code coverage to test for caveats.

implement pgen and psex

The current implementation is internal and slightly broken. It would be good to implement this even if it's only in R.

This will be closed when the following tasks are complete:

  • function to estimate round-robin allele frequencies is in place
  • pgen works for genind objects
  • code is documented
  • tests are written
  • Stacy Krueger-Hadfield and Erik Sotka are added as contributors on the functions and DESCRIPTION
  • add NEWS
  • add functions to package man page
  • add functions to main vignette

add skip_on_cran() to test-filter.R

I got a lovely email from BDR today:

On Jul 18, 2015, at 03:45 , Prof Brian Ripley [email protected] wrote:

See https://cran.r-project.org/web/checks/check_results_poppr.html . There are several separate errors, but for fedora-gcc it seems you are testing floating-point equality which the manual warns you not to do.

Please fix ASAP.

Brian D. Ripley, [email protected]
Emeritus Professor of Applied Statistics, University of Oxford
1 South Parks Road, Oxford OX1 3TG, UK

Because I'm already testing on coveralls, I think I can safely add skip_on_cran() to that particular test. Since I will also be uploading the version that fixes #47, I will increase the version number hopefully the last time.

add backwards compatibility for hierarchy methods in version 2.0

Since the hierarchy methods were moved to adegenet 2.0, all of them have changed names slightly to account for the preferred style (camelCase in adegenet). To make the transition smoother, the hierarchy functions from poppr should be kept, but should only serve as wrappers to the methods in adegenet (note that both the assignment and functional methods need to have replacers):

poppr version adegenet version
sethierarchy strata
gethierarchy strata
splithierarchy splitStrata
addhierarchy addStrata
namehierarchy nameStrata

Integrate bootstrap confidence intervals for diversity stats

Introduction

The poppr command returns a table that contains diversity summary statistics and linkage summary statistics. Currently, the only summary statistics that give a level of confidence are rarefaction (eMLG) and linkage (Ia and rbarD). It only makes sense to provide confidence intervals or standard errors for the other diversity statistics present in the diversity table.

Current implementation

A working example can be found in this bit of code. It performs bootstrapping using the boot package and returns confidence intervals. Since it uses the boot package, it has inherent multicore functionality.

Tasks

  • integrate bootstrapping for summary statistics
  • allow users to select confidence intervals or standard errors
  • add option to keep or remove analyses from the table
  • document function

Explicitly document model choice options in Bruvo's distance

Bruvo's distance currently has the options to turn on and off the genome addition and genome loss models, but does not explicitly state what happens when both are on and both are off. This should be documented in all bruvo functions as well as the Algorithms and Equations vignette.

Single locus genalex files cannot be imported

This post in the poppr forum showed that a single locus data set could not be imported. Here's a reproducible example.

# Note, this is a subset of the monpop data set, which is haploid. I'm treating it as a diploid here
zz <- "1    6   1   6
7_09_BB         
Ind Pop CHMFc4  CHMFc5
A004    7_09_BB 224 85
A002    7_09_BB 224 97
A011    7_09_BB 224 97
A009    7_09_BB 224 97
A006    7_09_BB 224 97
A013    7_09_BB 224 97"

read.genalex(textConnection(z), sep = "\t")
 Error in gena2[, ((x - 1)/ploidy) + 1] <<- pop_combiner(gena, hier = x:(x +  : 
  incorrect number of subscripts on matrix 

The reason for this error was because of our old friend drop = TRUE being default for subsetting matrices. I am scouring the code and setting drop = FALSE everywhere.

Add 2.0-rc tag after merge

This will allow compatibility for the link in the frontiers paper.

  • merge 2.0-rc into master and devel
  • remove 2.0-rc branch
  • change README and NEWS
  • tag as 2.0-rc so that link in frontiers paper still works and make pre-release.
  • manually change version and submit to CRAN
  • tag version once it's accepted on CRAN.

add function to recalcuate MLGs

If the user has a genclone object, the MLGs are static from when the data is read in to allow for identification downstream. There is no obvious way to reset those MLGs currently. It would be good to provide a small function that would do this.

Allow text connections for read.genalex

read.genalex currently reads in a file from disc formatted as a genalex csv. Unfortunately, it does not work well on connection objects. To alleviate this, the argument skip = 2 on this line in read.genalex should be changed to skip = ifelse("connection" %in% class(genalex), 0, 2), which would allow connections to be read in thusly:

library(poppr)
data(partial_clone)
pc <- file()
genind2genalex(partial_clone, filename = pc)
partial_clone2 <- read.genalex(pc)
close(pc)

This would make it easier to create reproducible examples.

Memory leak with valgrind

I haven't gotten an email yet, but there's a memtest note on CRAN. It's coming from bitwise.c, which makes me think it might be a threadding issue :(

==21510== Memcheck, a memory error detector
==21510== Copyright (C) 2002-2013, and GNU GPL'd, by Julian Seward et al.
==21510== Using Valgrind-3.10.1 and LibVEX; rerun with -h for copyright info
==21510== Command: /data/blackswan/ripley/R/R-devel-vg/bin/exec/R -f test-all.R --restore --save --no-readline --vanilla
==21510== 

R Under development (unstable) (2015-07-03 r68625) -- "Unsuffered Consequences"
Copyright (C) 2015 The R Foundation for Statistical Computing
Platform: x86_64-unknown-linux-gnu (64-bit)

R is free software and comes with ABSOLUTELY NO WARRANTY.
You are welcome to redistribute it under certain conditions.
Type 'license()' or 'licence()' for distribution details.

R is a collaborative project with many contributors.
Type 'contributors()' for more information and
'citation()' on how to cite R or R packages in publications.

Type 'demo()' for some demos, 'help()' for on-line help, or
'help.start()' for an HTML browser interface to help.
Type 'q()' to quit R.

> library(testthat)
> test_check("poppr")
Loading required package: poppr
Loading required package: adegenet
Loading required package: ade4

   /// adegenet 2.0.0 is loaded ////////////

   > overview: '?adegenet'
   > tutorials/doc/questions: 'adegenetWeb()' 
   > bug reports/feature resquests: adegenetIssues()


This is poppr version 2.0.0. To get started, type package?poppr
==21510== Invalid read of size 4
==21510==    at 0x1BD52E0F: bitwise_distance_haploid._omp_fn.0 (packages/tests-vg/poppr/src/bitwise_distance.c:232)
==21510==    by 0x35D0809BBE: GOMP_parallel (in /usr/lib64/libgomp.so.1.0.0)
==21510==    by 0x1BD50F08: bitwise_distance_haploid (packages/tests-vg/poppr/src/bitwise_distance.c:175)
==21510==    by 0x479A92: do_dotcall (svn/R-devel/src/main/dotcode.c:1251)
==21510==    by 0x4B1C3E: Rf_eval (svn/R-devel/src/main/eval.c:655)
==21510==    by 0x4B4C35: do_set (svn/R-devel/src/main/eval.c:2106)
==21510==    by 0x4B1AB0: Rf_eval (svn/R-devel/src/main/eval.c:627)
==21510==    by 0x4B36C8: do_begin (svn/R-devel/src/main/eval.c:1716)
==21510==    by 0x4B1AB0: Rf_eval (svn/R-devel/src/main/eval.c:627)
==21510==    by 0x4B1AB0: Rf_eval (svn/R-devel/src/main/eval.c:627)
==21510==    by 0x4B36C8: do_begin (svn/R-devel/src/main/eval.c:1716)
==21510==    by 0x4B1AB0: Rf_eval (svn/R-devel/src/main/eval.c:627)
==21510==  Address 0x1ce20844 is 772 bytes inside a block of size 1,968 alloc'd
==21510==    at 0x4A06BCF: malloc (in /usr/lib64/valgrind/vgpreload_memcheck-amd64-linux.so)
==21510==    by 0x4DE26D: GetNewPage (svn/R-devel/src/main/memory.c:865)
==21510==    by 0x4DFD71: Rf_allocVector3 (svn/R-devel/src/main/memory.c:2437)
==21510==    by 0x541032: Rf_allocVector (svn/R-devel/src/include/Rinlinedfuns.h:194)
==21510==    by 0x541032: HashLookup (svn/R-devel/src/main/unique.c:801)
==21510==    by 0x542C10: match5 (svn/R-devel/src/main/unique.c:902)
==21510==    by 0x4A3DFA: bcEval (svn/R-devel/src/main/eval.c:5524)
==21510==    by 0x4B178F: Rf_eval (svn/R-devel/src/main/eval.c:558)
==21510==    by 0x4B6DBC: Rf_applyClosure (svn/R-devel/src/main/eval.c:1039)
==21510==    by 0x4AA831: bcEval (svn/R-devel/src/main/eval.c:5496)
==21510==    by 0x4B178F: Rf_eval (svn/R-devel/src/main/eval.c:558)
==21510==    by 0x4B6DBC: Rf_applyClosure (svn/R-devel/src/main/eval.c:1039)
==21510==    by 0x4AA831: bcEval (svn/R-devel/src/main/eval.c:5496)
==21510== 
==21510== Thread 2:
==21510== Invalid read of size 4
==21510==    at 0x1BD52EE0: bitwise_distance_haploid._omp_fn.0 (packages/tests-vg/poppr/src/bitwise_distance.c:248)
==21510==    by 0x35D080DB65: ??? (in /usr/lib64/libgomp.so.1.0.0)
==21510==    by 0x35B5607529: start_thread (in /usr/lib64/libpthread-2.20.so)
==21510==    by 0x35B4F0022C: clone (in /usr/lib64/libc-2.20.so)
==21510==  Address 0x1ce20844 is 772 bytes inside a block of size 1,968 alloc'd
==21510==    at 0x4A06BCF: malloc (in /usr/lib64/valgrind/vgpreload_memcheck-amd64-linux.so)
==21510==    by 0x4DE26D: GetNewPage (svn/R-devel/src/main/memory.c:865)
==21510==    by 0x4DFD71: Rf_allocVector3 (svn/R-devel/src/main/memory.c:2437)
==21510==    by 0x541032: Rf_allocVector (svn/R-devel/src/include/Rinlinedfuns.h:194)
==21510==    by 0x541032: HashLookup (svn/R-devel/src/main/unique.c:801)
==21510==    by 0x542C10: match5 (svn/R-devel/src/main/unique.c:902)
==21510==    by 0x4A3DFA: bcEval (svn/R-devel/src/main/eval.c:5524)
==21510==    by 0x4B178F: Rf_eval (svn/R-devel/src/main/eval.c:558)
==21510==    by 0x4B6DBC: Rf_applyClosure (svn/R-devel/src/main/eval.c:1039)
==21510==    by 0x4AA831: bcEval (svn/R-devel/src/main/eval.c:5496)
==21510==    by 0x4B178F: Rf_eval (svn/R-devel/src/main/eval.c:558)
==21510==    by 0x4B6DBC: Rf_applyClosure (svn/R-devel/src/main/eval.c:1039)
==21510==    by 0x4AA831: bcEval (svn/R-devel/src/main/eval.c:5496)
==21510== 
==21510== Thread 1:
==21510== Invalid read of size 4
==21510==    at 0x1BD5331F: bitwise_distance_diploid._omp_fn.1 (packages/tests-vg/poppr/src/bitwise_distance.c:439)
==21510==    by 0x35D0809BBE: GOMP_parallel (in /usr/lib64/libgomp.so.1.0.0)
==21510==    by 0x1BD5125F: bitwise_distance_diploid (packages/tests-vg/poppr/src/bitwise_distance.c:374)
==21510==    by 0x479A92: do_dotcall (svn/R-devel/src/main/dotcode.c:1251)
==21510==    by 0x4B1C3E: Rf_eval (svn/R-devel/src/main/eval.c:655)
==21510==    by 0x4B4C35: do_set (svn/R-devel/src/main/eval.c:2106)
==21510==    by 0x4B1AB0: Rf_eval (svn/R-devel/src/main/eval.c:627)
==21510==    by 0x4B36C8: do_begin (svn/R-devel/src/main/eval.c:1716)
==21510==    by 0x4B1AB0: Rf_eval (svn/R-devel/src/main/eval.c:627)
==21510==    by 0x4B1AB0: Rf_eval (svn/R-devel/src/main/eval.c:627)
==21510==    by 0x4B36C8: do_begin (svn/R-devel/src/main/eval.c:1716)
==21510==    by 0x4B1AB0: Rf_eval (svn/R-devel/src/main/eval.c:627)
==21510==  Address 0x1d27d254 is 724 bytes inside a block of size 1,968 alloc'd
==21510==    at 0x4A06BCF: malloc (in /usr/lib64/valgrind/vgpreload_memcheck-amd64-linux.so)
==21510==    by 0x4DE26D: GetNewPage (svn/R-devel/src/main/memory.c:865)
==21510==    by 0x4DFD71: Rf_allocVector3 (svn/R-devel/src/main/memory.c:2437)
==21510==    by 0x446671: Rf_allocVector (svn/R-devel/src/include/Rinlinedfuns.h:194)
==21510==    by 0x446671: do_nzchar (svn/R-devel/src/main/character.c:123)
==21510==    by 0x4A3DFA: bcEval (svn/R-devel/src/main/eval.c:5524)
==21510==    by 0x4B178F: Rf_eval (svn/R-devel/src/main/eval.c:558)
==21510==    by 0x4B6DBC: Rf_applyClosure (svn/R-devel/src/main/eval.c:1039)
==21510==    by 0x4AA831: bcEval (svn/R-devel/src/main/eval.c:5496)
==21510==    by 0x4B178F: Rf_eval (svn/R-devel/src/main/eval.c:558)
==21510==    by 0x4B6DBC: Rf_applyClosure (svn/R-devel/src/main/eval.c:1039)
==21510==    by 0x4AA831: bcEval (svn/R-devel/src/main/eval.c:5496)
==21510==    by 0x4B178F: Rf_eval (svn/R-devel/src/main/eval.c:558)
==21510== 
==21510== Thread 2:
==21510== Invalid read of size 4
==21510==    at 0x1BD533F0: bitwise_distance_diploid._omp_fn.1 (packages/tests-vg/poppr/src/bitwise_distance.c:455)
==21510==    by 0x35D080DB65: ??? (in /usr/lib64/libgomp.so.1.0.0)
==21510==    by 0x35B5607529: start_thread (in /usr/lib64/libpthread-2.20.so)
==21510==    by 0x35B4F0022C: clone (in /usr/lib64/libc-2.20.so)
==21510==  Address 0x1d27d254 is 724 bytes inside a block of size 1,968 alloc'd
==21510==    at 0x4A06BCF: malloc (in /usr/lib64/valgrind/vgpreload_memcheck-amd64-linux.so)
==21510==    by 0x4DE26D: GetNewPage (svn/R-devel/src/main/memory.c:865)
==21510==    by 0x4DFD71: Rf_allocVector3 (svn/R-devel/src/main/memory.c:2437)
==21510==    by 0x446671: Rf_allocVector (svn/R-devel/src/include/Rinlinedfuns.h:194)
==21510==    by 0x446671: do_nzchar (svn/R-devel/src/main/character.c:123)
==21510==    by 0x4A3DFA: bcEval (svn/R-devel/src/main/eval.c:5524)
==21510==    by 0x4B178F: Rf_eval (svn/R-devel/src/main/eval.c:558)
==21510==    by 0x4B6DBC: Rf_applyClosure (svn/R-devel/src/main/eval.c:1039)
==21510==    by 0x4AA831: bcEval (svn/R-devel/src/main/eval.c:5496)
==21510==    by 0x4B178F: Rf_eval (svn/R-devel/src/main/eval.c:558)
==21510==    by 0x4B6DBC: Rf_applyClosure (svn/R-devel/src/main/eval.c:1039)
==21510==    by 0x4AA831: bcEval (svn/R-devel/src/main/eval.c:5496)
==21510==    by 0x4B178F: Rf_eval (svn/R-devel/src/main/eval.c:558)
==21510== 
Loading required package: parallel
testthat results ================================================================
OK: 221 SKIPPED: 52 FAILED: 0
> 
> proc.time()
   user  system elapsed 
385.387  78.932 391.750 
==21510== 
==21510== HEAP SUMMARY:
==21510==     in use at exit: 237,692,259 bytes in 121,847 blocks
==21510==   total heap usage: 814,885 allocs, 693,038 frees, 1,242,042,234 bytes allocated
==21510== 
==21510== Thread 1:
==21510== 18,352 bytes in 31 blocks are possibly lost in loss record 4,689 of 5,786
==21510==    at 0x4A08946: calloc (in /usr/lib64/valgrind/vgpreload_memcheck-amd64-linux.so)
==21510==    by 0x35B46129E4: _dl_allocate_tls (in /usr/lib64/ld-2.20.so)
==21510==    by 0x35B5607EC7: pthread_create@@GLIBC_2.2.5 (in /usr/lib64/libpthread-2.20.so)
==21510==    by 0x35D080E0B6: ??? (in /usr/lib64/libgomp.so.1.0.0)
==21510==    by 0x35D0809BB9: GOMP_parallel (in /usr/lib64/libgomp.so.1.0.0)
==21510==    by 0x1BD50F08: bitwise_distance_haploid (packages/tests-vg/poppr/src/bitwise_distance.c:175)
==21510==    by 0x479A92: do_dotcall (svn/R-devel/src/main/dotcode.c:1251)
==21510==    by 0x4B1C3E: Rf_eval (svn/R-devel/src/main/eval.c:655)
==21510==    by 0x4B4C35: do_set (svn/R-devel/src/main/eval.c:2106)
==21510==    by 0x4B1AB0: Rf_eval (svn/R-devel/src/main/eval.c:627)
==21510==    by 0x4B36C8: do_begin (svn/R-devel/src/main/eval.c:1716)
==21510==    by 0x4B1AB0: Rf_eval (svn/R-devel/src/main/eval.c:627)
==21510== 
==21510== 60,656 bytes in 3,791 blocks are definitely lost in loss record 5,303 of 5,786
==21510==    at 0x4A08946: calloc (in /usr/lib64/valgrind/vgpreload_memcheck-amd64-linux.so)
==21510==    by 0x4D9140: R_chk_calloc (svn/R-devel/src/main/memory.c:3186)
==21510==    by 0x1BD55344: bruvo_dist (packages/tests-vg/poppr/src/poppr_distance.c:425)
==21510==    by 0x1BD56050: bruvo_distance (packages/tests-vg/poppr/src/poppr_distance.c:296)
==21510==    by 0x479A92: do_dotcall (svn/R-devel/src/main/dotcode.c:1251)
==21510==    by 0x4B1C3E: Rf_eval (svn/R-devel/src/main/eval.c:655)
==21510==    by 0x4B4C35: do_set (svn/R-devel/src/main/eval.c:2106)
==21510==    by 0x4B1AB0: Rf_eval (svn/R-devel/src/main/eval.c:627)
==21510==    by 0x4B36C8: do_begin (svn/R-devel/src/main/eval.c:1716)
==21510==    by 0x4B1AB0: Rf_eval (svn/R-devel/src/main/eval.c:627)
==21510==    by 0x4B6DBC: Rf_applyClosure (svn/R-devel/src/main/eval.c:1039)
==21510==    by 0x4B18E8: Rf_eval (svn/R-devel/src/main/eval.c:674)
==21510== 
==21510== 68,128 (34,064 direct, 34,064 indirect) bytes in 2,129 blocks are definitely lost in loss record 5,347 of 5,786
==21510==    at 0x4A08946: calloc (in /usr/lib64/valgrind/vgpreload_memcheck-amd64-linux.so)
==21510==    by 0x4D9140: R_chk_calloc (svn/R-devel/src/main/memory.c:3186)
==21510==    by 0x1BD55356: bruvo_dist (packages/tests-vg/poppr/src/poppr_distance.c:432)
==21510==    by 0x1BD56050: bruvo_distance (packages/tests-vg/poppr/src/poppr_distance.c:296)
==21510==    by 0x479A92: do_dotcall (svn/R-devel/src/main/dotcode.c:1251)
==21510==    by 0x4B1C3E: Rf_eval (svn/R-devel/src/main/eval.c:655)
==21510==    by 0x4B4C35: do_set (svn/R-devel/src/main/eval.c:2106)
==21510==    by 0x4B1AB0: Rf_eval (svn/R-devel/src/main/eval.c:627)
==21510==    by 0x4B36C8: do_begin (svn/R-devel/src/main/eval.c:1716)
==21510==    by 0x4B1AB0: Rf_eval (svn/R-devel/src/main/eval.c:627)
==21510==    by 0x4B6DBC: Rf_applyClosure (svn/R-devel/src/main/eval.c:1039)
==21510==    by 0x4B18E8: Rf_eval (svn/R-devel/src/main/eval.c:674)
==21510== 
==21510== LEAK SUMMARY:
==21510==    definitely lost: 94,720 bytes in 5,920 blocks
==21510==    indirectly lost: 34,064 bytes in 4,258 blocks
==21510==      possibly lost: 18,352 bytes in 31 blocks
==21510==    still reachable: 237,545,123 bytes in 111,638 blocks
==21510==         suppressed: 0 bytes in 0 blocks
==21510== Reachable blocks (those to which a pointer was found) are not shown.
==21510== To see them, rerun with: --leak-check=full --show-leak-kinds=all
==21510== 
==21510== For counts of detected and suppressed errors, rerun with: -v
==21510== ERROR SUMMARY: 11 errors from 7 contexts (suppressed: 0 from 0)

Pairwise AMOVA

GenAlex generates pairwise PhiPT values for any AMOVA analysis run with more than one between group comparison. We have many between group comparisons to make. Is there an easy way using Poppr to generate something similar to what GenAlex produces? Please add pairwise_amova

Thanks!
S

Text encoding errors

Several files have UTF-8 em-dashes and directional quotes.

From Dr. Ripley:

There are also issues when checking in a Latin-1 locale on Linux:

  • checking whether package 'poppr' can be installed ... [10s/11s] WARNING
    Found the following significant warnings:
    Warning: unable to re-encode 'amova.r' line 175
    Warning: unable to re-encode 'distances.r' lines 158, 161, 167, 171, 178, 181, 187, 191, 197, 200
    Warning: unable to re-encode 'internal.r' lines 96, 111, 112
    Warning: unable to re-encode 'poppr.R' lines 65, 67, 68, 69, 70, 71, 76, 96

Implement index selection (filtering) in bitwise.IA

bitwise.IA accepts as input a vector of locus indices to be used in its calculations. Currently this vector is ignored. The final version should be able to compute the index of association over this limited set of loci.

[.table method in R-devel breaks poppr.amova

This is an issue that will not affect most of the user base at the moment, but will do so when R releases a new version:

From Kurt Hornik:

Dear maintainer,

Current r-devel has added a [.table method which by default preserves
the table class in "most" cases.

This breaks your package with

amova.result <- poppr.amova(agc, ~Pop/Subpop)

No missing values detected.
Warning in Ops.factor(left, right) : ā€˜<ā€™ not meaningful for factors
Warning in Ops.factor(left, right) : ā€˜<ā€™ not meaningful for factors
Error in if (any(samples < 0)) stop("Negative value in samples") : 
 missing value where TRUE/FALSE needed
Calls: poppr.amova -> <Anonymous>

I think the problem is that in mlg.table() you have

 mlgtab <- mlgtab[, which(colSums(mlgtab) > 0)]
 return(mlgtab)

where subscripting used to drop the table class, but no longer does, and
so

 xtab    <- t(mlg.table(x, bar = FALSE, quiet = TRUE, mlgsub = unique(mlg.vector(x))))
 xtab    <- as.data.frame(xtab)

ends up calling as.data.frame.table().

To fix, use e.g.

 xtab    <- as.data.frame(unclass(xtab))

I think.

Can you pls fix this as necessary, and upload a new version as quickly
as possible?

Alongside, pls also fix

The Title field should be in title case, current version then in title case:
ā€˜Genetic Analysis of Populations With Mixed Reproductionā€™
ā€˜Genetic Analysis of Populations with Mixed Reproductionā€™

and single quote 'adegenet' in the Description.

Best
-k

Allow poppr.amova to take subsetted genclone objects

Currently poppr.amova partitions the mlg table by numerical index at this line. In the genclone objects, unfortunately, this index doesn't work as some MLGs might be excluded. The error produced is this:

Error in xtab[unique(mlg.vector(x)), ] : subscript out of bounds

To get around this, mlg.table should be called with the flag mlgsub = unique(mlg.vector(x, quiet = TRUE)), where x is a genclone object. This will automatically sort the MLG table to prevent this line from being needed.

UBSAN shows a memory-access error

After submission of poppr version 1.1.0 to CRAN, a notification was sent from Dr. Ripley notifying me of an error in UBSAN while running tests:

Analytical value tests : poppr_distance.c:664:5: runtime error: index 1 out of bounds for type 'int [zerocatch[miss_ind]]'

This error stems from illegal memory allocation. I will sanitize the code and post a patch.

Hexp calcuation is erroneous

The problem:

After re-reading Nei (1978), I realized that my implementation of Hexp is:

N/(N - 1) * 1 - sum(p^2)

Where N is the number of allelic states and p is the vector of allele frequencies. Nei's definition is:

kn/(kn - 1) * 1 - sum(p^2)

Where n is the number of observed samples at a locus and k is the ploidy (to account for dosage).

User-facing impacts:

  • poppr()
  • locus_table()

What needs to be fixed:

  • internal function locus_table_pegas()

Impacts after fix:

  • Polyploids - Because the calculation is dependent on ploidy, this means that it would inappropriate to calculate this for polyploids due to ambiguous dosage.
  • locus_table(type = "genotype") - Hexp will change to unbiased simpson's index.

Unfortunately, I might have to wait until just before August to submit this patch, lest I anger our CRAN overlords.

Fix poppr for upcoming release of adegenet/pegas/hierfstat

The adegenet package is going through major changes on github and many functions in poppr will break due to these changes.

Three main things will change:

  1. the tab slot of genind objects will store data as integers as opposed to floats
  2. the ploidy slot will contain the ploidy for all samples
  3. genind and genlight objects will gain genclone's hierarchy slot

Changes that need to be implemented in poppr:

  • remove hierarchy slots from genclone objects as they will be inherited
  • remove hierarchy methods
  • change all references to $tab and @tab to tab().
  • change na.replace(object) in missingno to tab(object) and deprecate.
  • ensure that distance methods are compatible.
  • change SEXP pairdiffs to take integer input.
  • change SEXP permute_shuff to take integer input.
  • add tab method to bootgen objects.

This is not an exhaustive list as I know bugs will crop up as I encounter them, so I'm putting this here:

  • WRITE A TEST FOR EVERY BUG ENCOUNTERED

Allow named vector of repeat lengths for Bruvo's distance

Bruvo's distance requires knowledge of repeat lengths for calculations. The current method has the user supply a numeric vector with repeat lengths, but they must be in the exact order as the loci. A nice feature to have would be to have the option to submit named repeat lengths and match them to the loci.

Bug in permutation scheme with triploids

There is a bug in the default shuffling algorithm for shufflepop and ia that comes down to numerical precision. The general process is:

  1. Take the sum of the columns (alleles) at a locus and multiply them by the ploidy to get the counts of each allele.
  2. Create a vector of indices of alleles where each index is replicated the number of times that allele is represented.
  3. Use that to shuffle the locus.

The problem occurs when there are triploids and 1/3 by 3 doesn't always equal 1.

mat <- structure(c(0.333333333333333, 0.333333333333333, 0.333333333333333, 
0.333333333333333, 0, 0, 0.333333333333333, 0, 0.666666666666667, 
0.666666666666667, 0.333333333333333, 0.666666666666667), .Dim = c(4L, 
3L), .Dimnames = list(structure(c("1", "2", "3", "5"), .Names = c("1", 
"2", "3", "4")), c("D13.000", "D13.144", "D13.190")))

bucket <- colSums(mat) * 3
bucket
## D13.000 D13.144 D13.190
##       4       1       7

rep(1:length(bucket), bucket)
## [1]11112333333

print(bucket, dig = 20)
## D13.000 D13.144 D13.190
## 4.0000000000000000000 1.0000000000000000000 6.9999999999999991118

A control scheme to check for the correct number of alleles is needed.

User model choice being ignored in bruvo.msn

Current workaround solution: calculate bruvo's distance with bruvo.dist() and then feed it into poppr.msn()

Problem: all calls to bruvo.dist() from bruvo.msn() and poppr:::singlepop_msn() do not give the add or loss parameters, thus defaulting to average over genome loss and genome addition models, despite the user choice.

I will fix these in poppr's current devel branch and then make a pull request for @JonahBrooks so that he can incorporate it into his code.

Parallelize bitwise.IA

bitwise.IA was designed with multi-threading in mind, but the functionality hasn't been incorporated into the function yet. It needs to be added before this function can be considered complete.

Add names to colors in *.msn functions

Currently the colors of the pies will be represented by unnamed character vectors of hexadecimal colors. This vector of colors is used as a lookup table when the color palette needs to be changed, but the assumption is that all colors are different. In the case that not all colors are different, the names of the populations are important at differentiating the colors. If the vector resulting from the color palette is named with the population names, then the colors in the pie.colors will be named as well resulting in an easier lookup.

example here can add the additional line:

names(color) <- [email protected]

This should be changed for the functions bruvo.msn and poppr.msn and the functions below update_poppr_graph should also be updated.

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    šŸ–– Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. šŸ“ŠšŸ“ˆšŸŽ‰

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ā¤ļø Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.