[vsni.co.uk] Contact us
Author Message
Post new topic

  ASReml  ~  Contrasts

jason
Posted: Wed Jul 08, 2009 6:27 am Reply with quote
Joined: 23 Sep 2008 Posts: 11
(1) Contrasts for interaction terms. In SAS PROC MIXED, I could use
Code:
"estimate 'B1 vs B2 in A1' B 1 -1 0 A*B 1 -1 0 0 0 0"
to conduct a contrast test for interaction terms. How can I do it in ASReml using !CONTRAST qualifier?

(2) How to do contrast tests in ASReml-R?

Thanks!
Jason
View user's profile Send private message
cullisb
Posted: Wed Jul 08, 2009 7:44 am Reply with quote
Guest
Dear jason
in answer to Q2

here is some simple R code to set up a factor with contrasts that you may wish to include in the linear mixed model via the fixed formula argument, I presume.

You can then interact the factor "fff" with any other terms in any way you wish

Note that if you dont specify how.many then R will create a full matrix with the extra column being orthogonal to the mean and the first column and normalised

Quote:
fff <- factor(c(1,2,3,1,2,3))
contrasts(fff) <- matrix(c(-1,1,0),3,1)
fff
[1] 1 2 3 1 2 3
attr(,"contrasts")
[,1] [,2]
1 -1 -0.4082483
2 1 -0.4082483
3 0 0.8164966
Levels: 1 2 3
Quote:
contrasts(fff,how.many=1) <- matrix(c(-1,1,0),3,1)
fff
[1] 1 2 3 1 2 3
attr(,"contrasts")
[,1]
1 -1
2 1
3 0
Levels: 1 2 3

HTH


warm regards

Brian Cullis
Research Leader, Biometrics &
Senior Principal Research Scientist
NSW Department of Primary Industries
Wagga Wagga Agricultural Institute

Visiting Professorial Fellow
School of Mathematics and Applied Statistics
Faculty of Informatics
University of Wollongong

Professor,
Faculty of Agriculture, Food & Natural Resources
The University of Sydney

Adjunct Professor
School of Computing and Mathematics
Charles Sturt University


Phone: 61 2 6938 1855
Fax: 61 2 6938 1809
Mobile: 0439 448 591






This message is intended for the addressee named and may contain confidential information. If you are not the intended recipient, please delete it and notify the sender. Views expressed in this message are those of the individual sender, and are not necessarily the views of their organisation.

Post generated using Mail2Forum (http://www.mail2forum.com)
Clempson, Andrew Martin
Posted: Wed Jul 08, 2009 10:18 am Reply with quote
Guest
Dear All

I wonder if someone could provide advice on dealing with missing values in binary data models.

I am in the process of performing association studies between SNPs and pregnancy status in cows. I have genotypic information on approximately 90% of animals and pregnancy status data on approximately 80% of animals (coded as 0 = pregnant, 1 = not pregnant).

At the moment, my model is very simple –

Fertility Analysis
ANIMALID !P
Sire !P
Dam !P
Herd !I
Age
PregnancyStatus
SNP1!I
Pedigree.ped !MAKE !SORT !REPEAT
Pregnancy.csv !CSV !MAXIT 100 !EXTRA 20 !MVINCLUDE !SKIP 1 !DISPLAY 15
PregnancyStatus !BIN ~ mu Herd Age SNP1 !r ANIMALID


The model only runs when the !MVINCLUDE qualifier is specified, however, I am trying to find a way of excluding missing values, as including them interferes with the result. Using MVEXCLUDE / MVREMOVE causes the model to fail.

I would be very grateful for any replies,
Many thanks

Andrew






This message is intended for the addressee named and may contain confidential information. If you are not the intended recipient, please delete it and notify the sender. Views expressed in this message are those of the individual sender, and are not necessarily the views of their organisation.

Post generated using Mail2Forum (http://www.mail2forum.com)
Arthur
Posted: Wed Jul 08, 2009 11:24 pm Reply with quote
Joined: 05 Aug 2008 Posts: 471 Location: Orange, NSW
Dear Andrew,

You can delete the records with missing SNP values with a transformation

In ASReml 3, you could use

SNP1 !DV*

Otherwise use
SNP1 !D 10

The latter also drops records with a value of 10 but I assume there are
none of those.


Also,
I expect SNP has just two states (0,1 or -1,1 probably). It would be
better
to drop the !I from its declaration and treat it as a covariable
so you are sure which way the effect is fitted.


Assuming you have in fact may SNPs to test,
there are ways using !CYCLE to test a whole lot in a single run.



On Wed, 2009-07-08 at 11:16 +0100, Clempson, Andrew Martin wrote:
Quote:
Dear All



I wonder if someone could provide advice on dealing with missing
values in binary data models.



I am in the process of performing association studies between SNPs and
pregnancy status in cows. I have genotypic information on
approximately 90% of animals and pregnancy status data on
approximately 80% of animals (coded as 0 = pregnant, 1 = not
pregnant).



At the moment, my model is very simple –



Fertility Analysis

ANIMALID !P

Sire !P

Dam !P

Herd !I

Age

PregnancyStatus

SNP1!I

Pedigree.ped !MAKE !SORT !REPEAT

Pregnancy.csv !CSV !MAXIT 100 !EXTRA 20 !MVINCLUDE !SKIP 1 !DISPLAY 15

PregnancyStatus !BIN ~ mu Herd Age SNP1 !r ANIMALID





The model only runs when the !MVINCLUDE qualifier is specified,
however, I am trying to find a way of excluding missing values, as
including them interferes with the result. Using MVEXCLUDE / MVREMOVE
causes the model to fail.



I would be very grateful for any replies,

Many thanks



Andrew










This message is intended for the addressee named and may contain confidential information. If you are not the intended recipient, please delete it and notify the sender. Views expressed in this message are those of the individual sender, and are not necessarily the views of their organisation.
--
Best Wishes,
Arthur Gilmour

Adjunct Professor
School of Computing and Mathematics
Charles Sturt University

Jesus went to the synagogue in every town in Galilee to preach and drive
out demons.
A leper came, knelt down and said, "If you will, you can make me clean".
Jesus was moved. He reached out and touched him and said
"I am willing. Be clean." He was cleansed immediately.

Mobile Number +61 427 227 468
Home phone +61 2 6364 3288 Skype: Arthur.Gilmour

http://www.CargoVale.com.au/ASReml

Travel:
Brazil Jul22 - Aug 8 BR Biometrics
UK Aug 10-14 ASReml workshop VSN
Mumbai Aug 16-17 ASReml workshop

Adelaide AAABG 27Sept to 2 Oct
Brisbane Probe 29Nov to 6 Dec
Bangladesh Prosihhkon 31Dec - 31 Jan

This message is intended for the addressee named and may contain confidential information. If you are not the intended recipient, please delete it and notify the sender. Views expressed in this message are those of the individual sender, and are not necessarily the views of their organisation.

Post generated using Mail2Forum (http://www.mail2forum.com)

_________________
Arthur Gilmour

Retired Principal Research Scientist (Biometrics)
View user's profile Send private message Send e-mail Visit poster's website
jason
Posted: Wed Jul 08, 2009 11:40 pm Reply with quote
Joined: 23 Sep 2008 Posts: 11
cullisb wrote:
Dear jason
in answer to Q2

here is some simple R code to set up a factor with contrasts that you may wish to include in the linear mixed model via the fixed formula argument, I presume.

You can then interact the factor "fff" with any other terms in any way you wish

Note that if you dont specify how.many then R will create a full matrix with the extra column being orthogonal to the mean and the first column and normalised

Quote:
fff <- factor(c(1,2,3,1,2,3))
contrasts(fff) <- matrix(c(-1,1,0),3,1)
fff
[1] 1 2 3 1 2 3
attr(,"contrasts")
[,1] [,2]
1 -1 -0.4082483
2 1 -0.4082483
3 0 0.8164966
Levels: 1 2 3
Quote:
contrasts(fff,how.many=1) <- matrix(c(-1,1,0),3,1)
fff
[1] 1 2 3 1 2 3
attr(,"contrasts")
[,1]
1 -1
2 1
3 0
Levels: 1 2 3

HTH


Thanks for your help, Brian.
I tried the approach you described, but it seems to me that there is no effect on ASReml-R. In the following simple example, "oats.1" and "oats.2" return the identical results. Did I do something wrong? By contrast, it worked with lm() - see the differences between "oats.x1" and "oats.x2".
Code:
anova(oats.1 <- asreml(fixed=yield ~ Variety, data=oats))
#               Df Sum of Sq Wald statistic Pr(Chisq)
# (Intercept)    1    778336           1070    <2e-16 ***
# Variety        2      1786              2    0.2930
# residual (MS)          728

anova(oats.x1 <- lm(yield ~ Variety, data=oats))
#           Df Sum Sq Mean Sq F value Pr(>F)
# Variety    2   1786     893  1.2277 0.2993
# Residuals 69  50200     728

contrasts(oats$Variety,how.many=1)<-matrix(c(1,-1,0),3,1)

anova(oats.2 <- asreml(fixed=yield ~ Variety, data=oats))
#               Df Sum of Sq Wald statistic Pr(Chisq)
# (Intercept)    1    778336           1070    <2e-16 ***
# Variety        2      1786              2    0.2930
# residual (MS)          728

anova(oats.x2 <- lm(yield ~ Variety, data=oats))
#          Df Sum Sq Mean Sq F value Pr(>F)
#Variety    1    336     336  0.4554  0.502
#Residuals 70  51650     738


Jason
View user's profile Send private message
Bruce Southey
Posted: Thu Jul 09, 2009 2:42 am Reply with quote
Guest
On Wed, Jul 8, 2009 at 5:16 AM, Clempson, Andrew
Martin<aclempson@rvc.ac.uk> wrote:
Quote:
Dear All



I wonder if someone could provide advice on dealing with missing values in
binary data models.

Avoid it! :-)

Quote:



I am in the process of performing association studies between SNPs and
pregnancy status in cows. I have genotypic information on approximately 90%
of animals and pregnancy status data on approximately 80% of animals (coded
as 0 = pregnant, 1 = not pregnant).

First some observations regarding missing records without considering the SNPs:
You probably should remove all animals with missing pregnancy status
from your data file but not the pedigree file. Just ensures the
correct data will be used.
If you have repeated records on the same animal, you must fit that
otherwise the genetic parameters are biased. If few animal have
repeated records, you probably need to delete the repeated records.
Since you are fitting an animal model, you need pedigree relationships
for all animals. If you have animals with no or limited genetic
relations you probably need to remove those.
Then fit your model without any SNPs to ensure that it does run and
the results are sensible (including solutions of animal effects).
Beware of the possibility for the occurrence of the extreme value
problem.

It is not a good idea to fit the actual SNPs rather you should do
association mapping (as been addressed by this list). That way you
will not have missing data due to missing SNPs.

If you really really must fit the actual SNPs, then if an SNP is very
rare then remove it because of the extreme category problem. If an SNP
is missing then you can impute it by different methods especially if
you know the location and surrounding SNPs - there is a genome now!

Bruce

This message is intended for the addressee named and may contain confidential information. If you are not the intended recipient, please delete it and notify the sender. Views expressed in this message are those of the individual sender, and are not necessarily the views of their organisation.

Post generated using Mail2Forum (http://www.mail2forum.com)
cullisb
Posted: Thu Jul 09, 2009 11:07 am Reply with quote
Guest
Dear jason
sorry the implementation is not there. We have decided to avoid the use of contrasts() in ASReml-R for good reasons.

We use reflexive generalised inverses to obtain a non-unique solution to the mixed model equations. These have useful properties and in doing this we avoid problems which other packages can run into with user enforced constraints, such as sum zero or others. It is still quite simple to set up any sort of contrasts you wish as long as you are aware of the potential perils of this

An example of this is given here where I use dummy variates to partition the 3 df for Nitrogen into 1 for 1vs4 and 2 for the rest. ASReml gracefully handles this with and you will see that 2 of the effects associated with Nitrogen have been aliased.
I hope this makes it clear.


require(asreml)
temp <- oats
temp$n1 <- as.numeric(temp$Nitrogen)
temp$n1 <- as.numeric(temp$n1==4) - as.numeric(temp$n1==1)
temp.asr <- asreml(yield~Variety + n1 + Nitrogen + Variety:n1 + Variety:Nitrogen,
random=~Blocks/Wplots,data=temp)
wald(temp.asr)
Quote:
coef(temp.asr)$fi
effect
Variety_Golden_rain:Nitrogen_0.2_cwt 0.000000
Variety_Golden_rain:Nitrogen_0.4_cwt 0.000000
Variety_Golden_rain:Nitrogen_0.6_cwt 0.000000
Variety_Golden_rain:Nitrogen_0_cwt 0.000000
Variety_Marvellous:Nitrogen_0.2_cwt 0.000000
Variety_Marvellous:Nitrogen_0.4_cwt 0.000000
Variety_Marvellous:Nitrogen_0.6_cwt -0.500000
Variety_Marvellous:Nitrogen_0_cwt 11.666667
Variety_Victory:Nitrogen_0.2_cwt 0.000000
Variety_Victory:Nitrogen_0.4_cwt 0.000000
Variety_Victory:Nitrogen_0.6_cwt -2.500000
Variety_Victory:Nitrogen_0_cwt -9.666667
Variety_Golden_rain:n1 0.000000
Variety_Marvellous:n1 -7.500000
Variety_Victory:n1 5.000000
Nitrogen_0.2_cwt 0.000000
Nitrogen_0.4_cwt 0.000000
Nitrogen_0.6_cwt 10.166667
Nitrogen_0_cwt -50.833333
n1 16.166667
Variety_Golden_rain 0.000000
Variety_Marvellous 2.500000
Variety_Victory -3.833333
(Intercept) 114.666667


warm regards

Brian Cullis
Research Leader, Biometrics &
Senior Principal Research Scientist
NSW Department of Primary Industries
Wagga Wagga Agricultural Institute

Visiting Professorial Fellow
School of Mathematics and Applied Statistics
Faculty of Informatics
University of Wollongong

Professor,
Faculty of Agriculture, Food & Natural Resources
The University of Sydney

Adjunct Professor
School of Computing and Mathematics
Charles Sturt University


Phone: 61 2 6938 1855
Fax: 61 2 6938 1809
Mobile: 0439 448 591






This message is intended for the addressee named and may contain confidential information. If you are not the intended recipient, please delete it and notify the sender. Views expressed in this message are those of the individual sender, and are not necessarily the views of their organisation.

Post generated using Mail2Forum (http://www.mail2forum.com)
jason
Posted: Thu Jul 09, 2009 6:21 pm Reply with quote
Joined: 23 Sep 2008 Posts: 11
Dear Brian,

It worked great and had the same results as using "!CONTRAST n1 Nitrogen -1 0 0 1" in ASReml. Thanks!

Could you also please shed me some light on how to setup contrasts with interaction terms, say "Variety_Marvellous vs. Variety_Victory in Nitrogen_0.2_cwt"? I believe that the contrast involves both the Variety effect and the Variety by Nitrogen interaction.

Jason
View user's profile Send private message
jason
Posted: Fri Jul 10, 2009 1:20 am Reply with quote
Joined: 23 Sep 2008 Posts: 11
cullisb wrote:
Dear jason
It worked great and had the same results as using "!CONTRAST n1 Nitrogen -1 0 0 1" in ASReml. Thanks!

Could you also please shed me some light on how to setup contrasts with interaction terms, say "Variety_Marvellous vs. Variety_Victory in Nitrogen_0.2_cwt"? I believe that the contrast involves both the Variety effect and the Variety by Nitrogen interaction.
****this is simple, and only requires you to not fit the main effect of Nitrogen. SO form the

v1<- V-M - V_V contrast as in the text script and simply fit v1 + Variety + v1:Nitrogen + Variety:Nitrogen and then the effects for both interaction terms should be nested within Variety. This follows from the definition of

A/B = A + A:B

where B is "nested on" A


warm regards

Brian Cullis
Research Leader, Biometrics &
Senior Principal Research Scientist
NSW Department of Primary Industries
Wagga Wagga Agricultural Institute



Dear Brian,

Thanks, and sorry to keep bothering you. Does the "v1" your mentioned refer to the following?
Code:
temp$v1 <-(temp$Variety=='Marvellous' & temp$Nitrogen=='0.2_cwt') -
          (temp$Variety=='Victory' & temp$Nitrogen=='0.2_cwt')
Fitting the following model
Code:
temp.asr <- asreml(yield ~  v1 + Variety + v1:Nitrogen + Variety:Nitrogen,
                   random = ~ Blocks/Wplots, data=temp)
did not give me a similar result of the contrast as reported from SAS:
Code:
proc mixed data=oats;
   class Blocks Nitrogen Subplots Variety Wplots;
   model yield = Variety Nitrogen Nitrogen*Variety;
   random Blocks Wplots(Blocks);
   contrast "Marvellous vs Victory in 0.2_cwt" Variety 0 1  -1 
                                     Nitrogen*Variety 0 1 -1 0 0  0 0 0  0  0 0  0;
run;
Code:
                                                     Num     Den
                Label                                 DF      DF    F Value    Pr > F

                Marvellous vs Victory in 0.2_cwt       1      45       3.76    0.0588
Had I made any stupid mistakes?

Jason
View user's profile Send private message
Arthur
Posted: Fri Jul 10, 2009 3:33 am Reply with quote
Joined: 05 Aug 2008 Posts: 471 Location: Orange, NSW
Dear Jason,

Let me join the discussion.

To calculate the contrast "Variety_Marvellous vs. Variety_Victory in Nitrogen_0.2_cwt" is the interaction of a covariable '0 1 -1 derived from the coding of Variety (assuming order GoldenRain Marvelous Victory) and a covariate 0 1 0 0 derived from the coding of Nitrogen (assuming order 0cwt 0.2cwt 0.4cwt 0.6cwt)

Brian provide code to create the first contrast.
In his interaction example, he calculated the contrast for all 4 levels of Nitrogen.

ASReml-r has a model function at() which creates the single contrast.

So temp.asr <- asreml(yield ~ v1 + Variety + at(Nitrogen,2).v1 + v1:Nitrogen + Variety:Nitrogen,
random = ~ Blocks/Wplots, data=temp)

should separate out that particular contrast.

However, if you have multiple contrasts, you need to be careful in interppreting them;

One approach would be to calculate the predicted Variety:Nitrogen means
with their variance matrix, and then do a t test of the difference between the
mean for Marvelous:0.2cwt and Victory:0.2cwt

In this balanced example, this should give the same result but in general,
the test of a regression coefficient, an an ANOVA statistic may give different results.

I trust this helps explain the principles behind the calculation.

_________________
Arthur Gilmour

Retired Principal Research Scientist (Biometrics)
View user's profile Send private message Send e-mail Visit poster's website
jason
Posted: Sat Jul 11, 2009 7:11 am Reply with quote
Joined: 23 Sep 2008 Posts: 11
Arthur wrote:
Dear Jason,
............

One approach would be to calculate the predicted Variety:Nitrogen means
with their variance matrix, and then do a t test of the difference between the
mean for Marvelous:0.2cwt and Victory:0.2cwt

In this balanced example, this should give the same result but in general,
the test of a regression coefficient, an an ANOVA statistic may give different results.

I trust this helps explain the principles behind the calculation.


Thanks, Arthur! For the alternative approach using t test, how to calculate the degree of freedom then?

Jason
View user's profile Send private message
Arthur
Posted: Sat Jul 11, 2009 10:52 pm Reply with quote
Joined: 05 Aug 2008 Posts: 471 Location: Orange, NSW
Dear Jason,

For the alternative approach using t test, how to calculate the degree of freedom then?

Good question. In this particular case, the experiment is balanced and it is possible to identify which stratta the contrast belongs too and so use the DF for the appropriate strata which ASReml-r can produce (svc.asreml) .

In general, the calculation is not straight forward and you would need to use the ANOVA like Wald test to obtain it. I'm not familiar with the ASReml-R syntax
as I use the standalone version. Again, care needs to be used to make sure the test is representing the correct marginality when testing contrasts.

_________________
Arthur Gilmour

Retired Principal Research Scientist (Biometrics)
View user's profile Send private message Send e-mail Visit poster's website

Display posts from previous:  

All times are GMT
Page 1 of 1
Post new topic

Jump to:  

You cannot post new topics in this forum
You cannot reply to topics in this forum
You cannot edit your posts in this forum
You cannot delete your posts in this forum
You cannot vote in polls in this forum
You can attach files in this forum
You can download files in this forum