Transforms: log, square root, power. Add a pop-up that suggests/explains different typ


Add functions to transform datasets about agroft HOT 23 CLOSED

ucd-ipo commented on September 27, 2024

Add functions to transform datasets

from agroft.

Comments (23)

moorepants commented on September 27, 2024

@msimmond @lmpincus

For a general multi-variable linear model with interactions, e.g. y ~ a1 + a2 + a1 * a2, do you want all variables transformed, or should the user be able to select which variables to transform?

For example, do we only allow:

y ~ a1^2 + a2^2 + a1^2 * a2^2

or do we also allow this:

y ~ a1^2 + a2 + a1^2 * a2?

from agroft.

rkingdc commented on September 27, 2024

If I remember correctly, if you are including quadratic effects in the model, you'll also need to include the corresponding non-quadratic (linear) effects as terms.

y ~ x1 + I(x1^2) + x2 + I(x2^2) + ...

Also, it's best practice to wrap the transformations in I(), which makes R evaluate the expression "as is" (as arithmetic) instead of interpreting it as formula syntax.
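A small sketch of why I() matters, using made-up data (the data frame d and its columns are hypothetical, not from agroft). Inside an R formula, ^ is a formula operator, so y ~ x1^2 fits the same model as y ~ x1; only I(x1^2) adds a true quadratic term.

```r
# Hypothetical data with a genuine quadratic effect of x1
set.seed(1)
d <- data.frame(x1 = runif(30))
d$y <- 1 + 2 * d$x1 + 3 * d$x1^2 + rnorm(30, sd = 0.1)

# Without I(), ^ is read as formula crossing, so this is just y ~ x1
fit_plain <- lm(y ~ x1^2, data = d)

# With I(), the square is evaluated arithmetically and becomes its own term
fit_asis <- lm(y ~ x1 + I(x1^2), data = d)

names(coef(fit_plain))
names(coef(fit_asis))
```

The first model has only an intercept and an x1 coefficient, while the second also has an I(x1^2) coefficient.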

from agroft.

moorepants commented on September 27, 2024

Thanks, Kyle. But my question was more about letting users choose individual variables to transform versus simply transforming all of the variables. I don't know what is common or necessary for agricultural data.

from agroft.

rkingdc commented on September 27, 2024

Yeah, I know. I just added my two cents because a statistician once gave me
a stern talking-to about that.

On Fri, Jul 17, 2015 at 8:04 PM, Jason K. Moore wrote:

Thanks, Kyle, But my question was more about letting users setting
individual variables to transform or simply transforming all of the
variables. I don't know what is common or necessary for agricultural data.


from agroft.

moorepants commented on September 27, 2024

I'll definitely use your two cents :)

I'm neither a statistician nor an expert at R, so any tips help.

from agroft.

moorepants commented on September 27, 2024

@msimmond Just a reminder here, that I need some sort of example of what you want to do with transformations.

from agroft.

moorepants commented on September 27, 2024

@msimmond This is still cloudy to me. Now that we are going to the 9 scenarios, where do transformations fit in? Do we need more scenarios or do you want this to work in a more general way, i.e. can be applied to any of the 9 scenarios?

from agroft.

msimmond commented on September 27, 2024

I want the transformation options to be applicable to any of the 9 scenarios. The user will need to be involved in deciding which transformation (if any) is applied to the data. That decision will involve inspecting (1) the Shapiro-Wilk test for normality of residuals, (2) Levene's test for homogeneity of variance (HOV), (3) the 1-df Tukey test for non-additivity (although test (3) is NOT run for a CRD), (4) the 'res x pred' plot, and (5) the table of LS means, SDs, and variances.

NOTES:
(2) HOV is the most important assumption, and it is considered met when Levene's test is non-significant. No conclusions should be drawn from the ANOVA if this assumption is violated.

If Levene’s test is significant and transformation of the dependent variable does not correct it, a variance-weighted ANOVA (e.g. Welch's) can be performed to test for differences among group means.

Here are some references: http://www.unh.edu/halelab/BIOL933/labs/lab6.pdf, http://www.unh.edu/halelab/BIOL933/lectures/lect_11_reading.pdf
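As a rough illustration of these checks for a CRD, here is a base-R sketch on simulated data. The data frame d, its columns, and the simulated values are hypothetical. Levene's test is computed by hand as a one-way ANOVA on absolute deviations from the group medians, which is an assumption standing in for car::leveneTest() so that the example needs no extra packages. Welch's ANOVA uses oneway.test() with var.equal = FALSE.

```r
# Hypothetical CRD data: three treatments whose spread grows with the mean
set.seed(42)
d <- data.frame(
  trt = factor(rep(c("A", "B", "C"), each = 10)),
  yield = c(rlnorm(10, 1, 0.4), rlnorm(10, 2, 0.4), rlnorm(10, 3, 0.4))
)

fit <- lm(yield ~ trt, data = d)

# (1) Shapiro-Wilk test for normality of the residuals
sw <- shapiro.test(residuals(fit))

# (2) Levene's test (median-centered), done by hand:
#     one-way ANOVA on absolute deviations from each group's median
d$abs_dev <- abs(d$yield - ave(d$yield, d$trt, FUN = median))
levene <- anova(lm(abs_dev ~ trt, data = d))

# (5) Table of treatment means, SDs, and variances
summary_table <- data.frame(
  trt = levels(d$trt),
  mean = as.numeric(tapply(d$yield, d$trt, mean)),
  sd = as.numeric(tapply(d$yield, d$trt, sd)),
  variance = as.numeric(tapply(d$yield, d$trt, var))
)

# Fallback when HOV cannot be fixed: Welch's variance-weighted ANOVA
welch <- oneway.test(yield ~ trt, data = d, var.equal = FALSE)

sw$p.value
levene[1, "Pr(>F)"]
summary_table
welch
```

For check (4), plot(residuals(fit) ~ fitted(fit)) gives the 'res x pred' plot for the same fit.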

from agroft.

moorepants commented on September 27, 2024

Ok, if we want to apply transformations to any of the 9 scenarios we'll need a few examples (at least) for me to get an idea of how to generalize it in the app. Maybe once you build the model formula, you can select the transformation. Will this transformation only ever happen on the dependent variable?

from agroft.

msimmond commented on September 27, 2024

After the model formula is selected and lm(), anova(), shapiro.test(), leveneTest(), the Tukey 1-df test on the squared predicted values, and plot(resids ~ preds) have been run, the user should be asked whether a transformation of the dependent variable (the only variable that is ever transformed) is needed. If so, the user selects a transformation, and the previous code is run again on the transformed variable so that they can check the output for improvement. There should be an option to try a different transformation, as well as an option to use Welch's variance-weighted ANOVA in case the Levene's test result cannot be corrected with transformations. I think the Ad-Hoc analyses of LSD should be on another tab to make it obvious that this is a next step once the model and data are all good.
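For the Tukey 1-df step, a minimal sketch on a simulated RCBD (the data frame d and its columns are hypothetical). It follows the usual construction of the test: fit the additive model, square its predicted values, and add them back as a single-df covariate. The F test on that covariate is the test for non-additivity.

```r
# Hypothetical RCBD: 4 blocks x 4 treatments, one observation per cell
set.seed(3)
d <- expand.grid(block = factor(1:4), trt = factor(c("A", "B", "C", "D")))
d$yield <- 10 + as.integer(d$block) + 2 * as.integer(d$trt) +
  rnorm(nrow(d), sd = 0.5)

# Additive model (no block x treatment interaction)
additive <- lm(yield ~ block + trt, data = d)

# Squared predicted values enter as a 1-df covariate
d$sq_preds <- predict(additive)^2
tukey_fit <- lm(yield ~ block + trt + sq_preds, data = d)

# The sq_preds row is the 1-df test for non-additivity
tukey_table <- anova(tukey_fit)
tukey_table["sq_preds", ]
```

A non-significant sq_preds row is consistent with additive block and treatment effects.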

from agroft.

moorepants commented on September 27, 2024

The transformation selection could come before the calls to lm/aov, shapiro.test, leveneTest, Tukey, etc. If it does, then we don't have to make the code run through a second time; the user would do it manually. So, for example, the user would first use the default transformation (None), see all the output of the above functions, and notice that the results don't look good. Then they'd go back to the transformation input box, change the transformation, and press "run analysis" again to see the new results of all the function calls. If we do it this way, it is on the user to rerun the model with the transformation, instead of having code that reruns it automatically when the user selects a transformation at the end of the analysis.
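A sketch of that flow, assuming hypothetical helper names (transform_response and run_analysis are not agroft functions) and a hypothetical data frame with trt and yield columns. The transformation is just another input with a default of "none", and the user reruns the same analysis after changing it.

```r
# Hypothetical helper: apply the selected transformation to the response
transform_response <- function(y,
                               transformation = c("none", "log10", "sqrt", "power"),
                               power = NULL) {
  transformation <- match.arg(transformation)
  switch(transformation,
    none = y,
    log10 = log10(y),
    sqrt = sqrt(y),
    power = {
      if (is.null(power)) stop("a power must be supplied")
      y^power
    }
  )
}

# Hypothetical "run analysis" step: the transformed response goes in a new
# column, so the original yield column is left untouched
run_analysis <- function(data, transformation = "none", power = NULL) {
  data$y_used <- transform_response(data$yield, transformation, power)
  fit <- lm(y_used ~ trt, data = data)
  list(anova = anova(fit), shapiro = shapiro.test(residuals(fit)))
}

# Simulated example data
set.seed(11)
d <- data.frame(
  trt = factor(rep(c("A", "B", "C"), each = 10)),
  yield = c(rlnorm(10, 1, 0.4), rlnorm(10, 2, 0.4), rlnorm(10, 3, 0.4))
)

first_pass <- run_analysis(d)                            # default: no transformation
second_pass <- run_analysis(d, transformation = "log10") # user changes input, reruns
```

Comparing first_pass$shapiro with second_pass$shapiro is the manual "rerun and check" step described above.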

from agroft.

moorepants commented on September 27, 2024

Welch's variance-weighted ANOVA

This is something new, and outside of the scope of what we've already agreed on doing. Maybe put this as "icing on the cake".

I think the Ad-Hoc analyses of LSD should be on another tab to make it obvious that this is a next step once the model and data are all good.

Sounds good.

from agroft.

msimmond commented on September 27, 2024

OK, icing, but this is a common thing you have to do with ag data. I can look into the code for it.

from agroft.

msimmond commented on September 27, 2024

Also, we'll need an option to de-transform the data in case the user wants this in the post-hoc plots (e.g., a bar graph showing de-transformed treatment means).

from agroft.

moorepants commented on September 27, 2024

So if you transform the data to create the model you may need to detransform it to do the post hoc tests?

Show me examples of the Welch's thing and some transformations, and then I can see how complex it will be to fit it all into the app.

And just to be a curious devil's advocate here...why not just teach the workshop attendees how to do this stuff in R instead of wrapping this app around limited functionality?

from agroft.

moorepants commented on September 27, 2024

For reference, this is Maegen's sample code for finding an exponent for a power transformation (this works with the RCBD two var example):

# Combine the two treatment factors into a single merged treatment factor
my.data$merged_treatment <- as.factor(paste0(my.data$clone, my.data$nitrogen))
str(my.data)
# Mean and variance of yield for each merged treatment
means <- aggregate(my.data$yield, list(my.data$merged_treatment), mean)
vars <- aggregate(my.data$yield, list(my.data$merged_treatment), var)
logmeans <- log10(means$x)
logvars <- log10(vars$x)
# Regress log10(variance) on log10(mean)
power.mod <- lm(logvars ~ logmeans)
mod.summary <- summary(power.mod)
# Identify the slope
slope <- mod.summary$coefficients[2, 1]
# Calculate the appropriate power of the transformation, where power = 1 - (slope / 2)
power <- 1 - slope / 2
power
# Create the power-transformed variable (this overwrites the original yield column)
my.data$yield <- my.data$yield^power

from agroft.

moorepants commented on September 27, 2024

Transformations are now supported in the app. Closing.

from agroft.

msimmond commented on September 27, 2024

Has the transformation back to the original units been added yet?

For a log-transformed variable X ==> 10^X
For a power-transformed variable X with exponent 'a' ==> X^(1/a)
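A quick round-trip check of the back-transformations, using made-up values (y and the exponent a are hypothetical). The inverse of log10 is raising 10 to the value, and the inverse of raising to the power a is raising to the power 1/a.

```r
# Hypothetical positive responses and a hypothetical power
y <- c(2.5, 4, 10, 16)
a <- 0.5

# Forward transformations
log_y <- log10(y)
pow_y <- y^a

# Back-transformations to the original units
back_log <- 10^log_y
back_pow <- pow_y^(1 / a)

all.equal(back_log, y)
all.equal(back_pow, y)
```

Note that a mean computed on the log scale and then back-transformed is the geometric mean of the original values, not their arithmetic mean.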

On Thu, Aug 13, 2015 at 6:14 PM, Jason K. Moore wrote:

Closed #13.


Maegen Simmonds, Ph.D.

from agroft.

moorepants commented on September 27, 2024

No, that is listed in a different issue #41 and I don't really understand what you want here. There is nothing to change back. There are columns in the data frame for the original variable and the transformed variables. I don't change the original column, I just add new columns.

from agroft.

msimmond commented on September 27, 2024

We just need the LS means for the treatments (from the LSD table) to be
transformed back to the original units. These back-transformed values should
be shown to the users, and they should also be the means used in the bar
plot. Does that make sense?
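Something like the following sketch, on simulated data (the data frame d, its columns, and the log10 choice are hypothetical). It assumes a balanced one-way design, where the model-based treatment means equal the plain treatment means on the transformed scale. The means are estimated on the transformed scale and then converted back to the original units for display.

```r
# Hypothetical one-way data analyzed on the log10 scale
set.seed(7)
d <- data.frame(
  trt = factor(rep(c("A", "B", "C"), each = 8)),
  yield = c(rlnorm(8, 1, 0.3), rlnorm(8, 2, 0.3), rlnorm(8, 3, 0.3))
)
d$log_yield <- log10(d$yield)  # new column; original yield is kept

fit <- lm(log_yield ~ trt, data = d)

# Treatment means on the transformed scale
new_trt <- data.frame(trt = factor(levels(d$trt), levels = levels(d$trt)))
means_log <- unname(predict(fit, newdata = new_trt))

# Back-transform to the original units for the table and bar plot
means_table <- data.frame(
  trt = new_trt$trt,
  mean_log10 = means_log,
  mean_original_units = 10^means_log
)
means_table
```

The mean_original_units column is what would feed a call such as barplot(means_table$mean_original_units, names.arg = means_table$trt).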

On Thu, Aug 13, 2015 at 9:51 PM, Jason K. Moore wrote:

No, that is listed in a different issue #41 and I don't really
understand what you want here. There is nothing to change back. There are
columns in the data frame for the original variable and the transformed
variables. I don't change the original column, I just add new columns.


Maegen Simmonds, Ph.D.

from agroft.

moorepants commented on September 27, 2024

Adding what you want to the examples we've been working on would make the clearest sense. If you can write the example code, I'll know exactly what to do.

from agroft.

moorepants commented on September 27, 2024

Also try out the app and let me know if it is doing transformations correctly. In both of the examples you added transformations to, I don't think they were correct because you transformed the same variables in this fashion: sqrt(log10(y))^power.
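To show the difference, a small sketch with made-up values (the data frame d and the column names are hypothetical). Overwriting the same column stacks each new transformation on top of the previous one, whereas writing each transformation to its own column always starts from the original variable.

```r
# Hypothetical response values
d <- data.frame(y = c(1, 10, 100, 1000))

# Stacked (probably unintended): each step overwrites y, so the second
# transformation is applied to already-transformed values
stacked <- d
stacked$y <- log10(stacked$y)
stacked$y <- sqrt(stacked$y)   # this is now sqrt(log10(y))

# Separate: every transformation is computed from the original column
d$y_log10 <- log10(d$y)
d$y_sqrt <- sqrt(d$y)

stacked$y
d
```

Keeping the original column intact also means no back-transformation is needed to recover the raw data.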

from agroft.

msimmond commented on September 27, 2024

I started adding it to #34, but need your help.

from agroft.
