<?xml version="1.0" encoding="UTF-8"?>
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns="http://purl.org/rss/1.0/" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:syn="http://purl.org/rss/1.0/modules/syndication/" xmlns:admin="http://webns.net/mvcb/">
  <channel rdf:about="http://blog.gmane.org/gmane.comp.lang.r.linguistics">
    <title>gmane.comp.lang.r.linguistics</title>
    <link>http://blog.gmane.org/gmane.comp.lang.r.linguistics</link>
    <description/>
    <syn:updatePeriod>hourly</syn:updatePeriod>
    <syn:updateFrequency>1</syn:updateFrequency>
    <syn:updateBase>1901-01-01T00:00+00:00</syn:updateBase>
    <items>
      <rdf:Seq>
        <rdf:li rdf:resource="http://comments.gmane.org/gmane.comp.lang.r.linguistics/522"/>
        <rdf:li rdf:resource="http://comments.gmane.org/gmane.comp.lang.r.linguistics/516"/>
        <rdf:li rdf:resource="http://comments.gmane.org/gmane.comp.lang.r.linguistics/511"/>
        <rdf:li rdf:resource="http://comments.gmane.org/gmane.comp.lang.r.linguistics/506"/>
        <rdf:li rdf:resource="http://comments.gmane.org/gmane.comp.lang.r.linguistics/503"/>
        <rdf:li rdf:resource="http://comments.gmane.org/gmane.comp.lang.r.linguistics/502"/>
        <rdf:li rdf:resource="http://comments.gmane.org/gmane.comp.lang.r.linguistics/496"/>
        <rdf:li rdf:resource="http://comments.gmane.org/gmane.comp.lang.r.linguistics/490"/>
        <rdf:li rdf:resource="http://comments.gmane.org/gmane.comp.lang.r.linguistics/486"/>
        <rdf:li rdf:resource="http://comments.gmane.org/gmane.comp.lang.r.linguistics/485"/>
        <rdf:li rdf:resource="http://comments.gmane.org/gmane.comp.lang.r.linguistics/475"/>
        <rdf:li rdf:resource="http://comments.gmane.org/gmane.comp.lang.r.linguistics/472"/>
        <rdf:li rdf:resource="http://comments.gmane.org/gmane.comp.lang.r.linguistics/467"/>
        <rdf:li rdf:resource="http://comments.gmane.org/gmane.comp.lang.r.linguistics/463"/>
        <rdf:li rdf:resource="http://comments.gmane.org/gmane.comp.lang.r.linguistics/456"/>
        <rdf:li rdf:resource="http://comments.gmane.org/gmane.comp.lang.r.linguistics/439"/>
        <rdf:li rdf:resource="http://comments.gmane.org/gmane.comp.lang.r.linguistics/437"/>
        <rdf:li rdf:resource="http://comments.gmane.org/gmane.comp.lang.r.linguistics/425"/>
        <rdf:li rdf:resource="http://comments.gmane.org/gmane.comp.lang.r.linguistics/412"/>
        <rdf:li rdf:resource="http://comments.gmane.org/gmane.comp.lang.r.linguistics/411"/>
        <rdf:li rdf:resource="http://comments.gmane.org/gmane.comp.lang.r.linguistics/410"/>
      </rdf:Seq>
    </items>
    <image rdf:resource="http://gmane.org/img/gmane-25t.png"/>
    <textinput rdf:resource="http://search.gmane.org/?group=$group=gmane.comp.lang.r.linguistics"/>
  </channel>
  <image rdf:about="http://gmane.org/img/gmane-25t.png">
    <title>Gmane</title>
    <url>http://gmane.org/img/gmane-25t.png</url>
    <link>http://gmane.org</link>
  </image>
  <item rdf:about="http://comments.gmane.org/gmane.comp.lang.r.linguistics/522">
    <title>Conflicting p-values from pvals.fnc</title>
    <link>http://comments.gmane.org/gmane.comp.lang.r.linguistics/522</link>
    <description>&lt;pre&gt;Dear R-langers,

I'm trying to run a mixed effect model using the lmer() function and have
run into some issues in interpreting the p-values generated by
pvals.fnc(). The design is a between-subjects design, with two fixed
effects (condition &amp;amp; block; each with two levels), and one random effect
(subject). Additionally, I have a set of weights that I want to include.

When looking at the pvals.fnc() output, there appears to be a large
discrepancy between the pMCMC values and the t-statistic p-values. Whereas
one of the main effects and the interaction are far from significant
judging by the pMCMC values, they are highly significant when looking at
the t-statistic p-values (e.g. Condition: pMCMC = 0.2294; Pr(&amp;gt;|t|) = 0.0000
&amp;amp; Condition*Block: pMCMC = 0.3296; Pr(&amp;gt;|t|) = 0.0000). I have read that
the t-statistic based p-values are less conservative, but the difference
between these two values seems really extreme.

Below is some code that simulates the model and the data. The original data
set has two precise charac&lt;/pre&gt;</description>
    <dc:creator>Tom Gijssels</dc:creator>
    <dc:date>2012-03-26T17:15:46</dc:date>
  </item>
  <item rdf:about="http://comments.gmane.org/gmane.comp.lang.r.linguistics/516">
    <title>Simpler model with random slopes</title>
    <link>http://comments.gmane.org/gmane.comp.lang.r.linguistics/516</link>
    <description>&lt;pre&gt;Hi all,

this question is ultimately based on Florian's lecture1 slides here:
http://hlplab.wordpress.com/2010/05/10/mini-womm/

I'm doing a mixed model logistic regression, with random intercepts
for items and random slopes for items with respect to the fixed effect
Indep2 (cf. slide 85):

(a) glmer(formula = Dep ~ 1 + (1 | Item) + (0 + Indep2 | Item) +
Indep1 + Indep2, data = my.data, family = binomial(link = "logit"))

As per slide 88, I can also reduce the random effects to (1 + Indep2 | Item):

(b) glmer(formula = Dep ~ 1 + (1 + Indep2 | Item) + Indep1 + Indep2,
data = my.data, family = binomial(link = "logit"))

It's not exactly clear to me what (1 + Indep2 | Item) does, since the
output of both (a) and (b) includes random intercepts for items and
random slopes for items by Indep2. At the same time, model (a) and (b)
differ in their exact estimates.

I would appreciate it if someone could explain what the difference
between model (a) and (b) is.

Thanks
Sverre

&lt;/pre&gt;</description>
    <dc:creator>Sverre Stausland</dc:creator>
    <dc:date>2012-03-19T16:37:15</dc:date>
  </item>
  <item rdf:about="http://comments.gmane.org/gmane.comp.lang.r.linguistics/511">
    <title>Questions about reporting mixed-effects results</title>
    <link>http://comments.gmane.org/gmane.comp.lang.r.linguistics/511</link>
    <description>&lt;pre&gt;Dear all,

I have a few questions about how to report the results of mixed-effects analyses for publication. I have been perusing the Jaeger &amp;amp; Kuperman presentation but a few questions remain.  

I have been asked by the reviewers to include a full regression table, which I take to comprise coefficient estimates, MCMC-based confidence intervals and MCMC-based p-value estimations.
- Should the model that I use to report these values contain uncentered predictors, centered predictors, or centered and scaled predictors?
- A few of my models involve random intercepts, and I believe that pvals.fnc() is not currently defined for models with random intercepts. Do you have any suggestions for how I should report these models?

My models contain many control variables and only one or two variables that I am actually concerned with.  As such, I have not worried about multicollinearity among the control variables.  I suppose I should just state this somewhere to facilitate the interpretation of the regression tables?

&lt;/pre&gt;</description>
    <dc:creator>Goldberg, Ariel M</dc:creator>
    <dc:date>2012-02-20T20:15:19</dc:date>
  </item>
  <item rdf:about="http://comments.gmane.org/gmane.comp.lang.r.linguistics/506">
    <title>negative deviances (again)</title>
    <link>http://comments.gmane.org/gmane.comp.lang.r.linguistics/506</link>
    <description>&lt;pre&gt;Hi,

I am writing because I'm trying to run an lmer model and I keep getting
negative deviances (and positive log-likelihoods, etc.). I've reinstalled R
and updated all packages:

platform       x86_64-pc-mingw32
arch           x86_64
os             mingw32
system         x86_64, mingw32
status
major          2
minor          14.1
year           2011
month          12
day            22
svn rev        57956
language       R
version.string R version 2.14.1 (2011-12-22)

lme4 version 0.999375-42

But the problem persists. It's not due to the data set. For example, I've
rerun a simple model from Harald's languageR library (on the data set
lexdec):

library(languageR)
data(lexdec)
lmer(RT ~ Frequency + (1 | Subject), lexdec)

Linear mixed model fit by REML
Formula: RT ~ Frequency + (1 | Subject)
   Data: lexdec
    AIC    BIC logLik deviance REMLdev
 -858.4 -836.8  433.2   -880.9  -866.4
[snip]

Linear mixed model fit by REML
Formula: RT ~ Frequency + Trial + (1 | Subject)
   Data: lexdec
    AIC  BIC logLik devi&lt;/pre&gt;</description>
    <dc:creator>T. Florian Jaeger</dc:creator>
    <dc:date>2012-01-28T18:27:29</dc:date>
  </item>
  <item rdf:about="http://comments.gmane.org/gmane.comp.lang.r.linguistics/503">
    <title>questions about logit mixed model with R</title>
    <link>http://comments.gmane.org/gmane.comp.lang.r.linguistics/503</link>
    <description>&lt;pre&gt;(I am resending this message because it seems it was not sent properly last time.)

In my experiment, subjects were exposed to artificial languages with different word orders (two of them frequent among the world's languages: SOV and SVO; and two of them infrequent: VSO and OSV). After training, subjects had to classify new sentences as "correct" or "incorrect" according to what they had learned. Sentences could be correct, contain a syntax violation, or contain a semantic violation (a mismatch between a scene and the sentence describing it). The dependent variables were response latency and accuracy (right or wrong answer). I'm trying to analyze the accuracy data (1 = right answer, 0 = wrong answer) using a mixed logit model with "word order" (OSV, SVO, SOV, VSO) and "type of sentence" (correct, semantic violation, syntax violation) as fixed factors and subject as a random factor. Word order is a between-subjects variable, while type of sentence is a repeated-measures factor.

My questions are:

1) In order to contrast ea&lt;/pre&gt;</description>
    <dc:creator>Angel Tabullo</dc:creator>
    <dc:date>2012-01-17T15:07:08</dc:date>
  </item>
  <item rdf:about="http://comments.gmane.org/gmane.comp.lang.r.linguistics/502">
    <title>Bielefeld Mixed Models Workshop 2012</title>
    <link>http://comments.gmane.org/gmane.comp.lang.r.linguistics/502</link>
    <description>&lt;pre&gt;Dear colleagues, 
we would like to alert you to our upcoming workshop, which we believe may be interesting for many of the users of this list:


*** BiMM 2012: Bielefeld Mixed Models Workshop ***

Mixed-effects models are a powerful tool for the statistical analysis and modelling of psycho-/linguistic data and are increasingly used in experimental and corpus-oriented work. The Bielefeld Mixed Models Workshop (BiMM 2012) offers an opportunity to gain insight into the application of mixed-effects models. Focus will be placed on combining theoretical background with practical data analysis (using R). Participants can look forward to lively discussions with and between experts about how to use, report and interpret these methods.
The workshop is targeted at students and researchers from psycho-/linguistics and related disciplines working or willing to work with mixed-effects models. No prior knowledge of these models is required. However, participants should be familiar with general statistics and R.

Place and &lt;/pre&gt;</description>
    <dc:creator>Helene Kreysa</dc:creator>
    <dc:date>2012-01-03T09:37:34</dc:date>
  </item>
  <item rdf:about="http://comments.gmane.org/gmane.comp.lang.r.linguistics/496">
    <title>Collinearity and centering multi-level (more than 2 levels) fixed predictors</title>
    <link>http://comments.gmane.org/gmane.comp.lang.r.linguistics/496</link>
    <description>&lt;pre&gt;Dear r-lang users,

I have a set of binary data from a 2 by 3 design study. I centered the
two-level predictor ('LHcenter': local &amp;amp; LD) but did not center the
3-level predictor ('cond': A, B, &amp;amp; C). As you can see below in the
triangular matrix towards the end of the lmer output, there is
significant collinearity; the absolute values of some of the
correlations are above 0.6.

---8&amp;lt;----------------------------------------------------------------------------------------------------------------------------

Generalized linear mixed model fit by the Laplace approximation
Formula: true ~ cond * LHcenter + (1 | subject) + (1 | items)
   Data: offlineTarget
   AIC   BIC   logLik deviance
  747.8 785.9 -365.9    731.8

Random effects:
 Groups  Name        Variance Std.Dev.
 items   (Intercept) 0.13838  0.37199
 subject (Intercept) 1.44652  1.20271
Number of obs: 864, groups: items, 30; subject, 29

Fixed effects:
                          Estimate    Std. Error     z value      Pr(&amp;gt;|z|)
(Int&lt;/pre&gt;</description>
    <dc:creator>Xiao He</dc:creator>
    <dc:date>2011-11-27T23:57:42</dc:date>
  </item>
  <item rdf:about="http://comments.gmane.org/gmane.comp.lang.r.linguistics/490">
    <title>Figuring out maximum random effects in mixed-effect regression models</title>
    <link>http://comments.gmane.org/gmane.comp.lang.r.linguistics/490</link>
    <description>&lt;pre&gt;To whom it may concern:

I would like to use a mixed-effect regression approach to examine the
adaptation effect (i.e., learning) during sentence comprehension.

Right now, I am trying to figure out the maximum random effects.

One of the models produced the following summary



Random effects:
 Groups   Name                              Variance  Std.Dev. Corr
 Subject  (Intercept)                       19867.54  140.952
          cPrimeType                          367.20   19.162  -1.000
          cCondition                         5558.49   74.555  -1.000
          clogLength                         8128.49   90.158   0.954
          cLogPresOrder                      3033.62   55.078  -0.216
          cPrimeType:cCondition             17606.58  132.690   0.164
          cPrimeType:cCondition:cPresOrder    351.92   18.759   0.142
 Item     (Intercept)                        2870.65   53.578
          cCondition                         3391.10   58.233   0.668
          cCondition:cPrimeType              1224.53   34.993  -0.652
 Residual                                   31654.56  177.917

Baayen, Davidson, and Bates (2008) noted that "the high correlation of the
intercept and slope for the subject random effects (-1.00) indicates that
the model has been overparameterized" (p. 395).

So, I inspected t&lt;/pre&gt;</description>
    <dc:creator>Sunfa Kim</dc:creator>
    <dc:date>2011-11-19T11:14:18</dc:date>
  </item>
  <item rdf:about="http://comments.gmane.org/gmane.comp.lang.r.linguistics/486">
    <title>Helmert (or really any) contrasts, log-likelihood, and specificity</title>
    <link>http://comments.gmane.org/gmane.comp.lang.r.linguistics/486</link>
    <description>&lt;pre&gt;Hello everyone,

I'm trying to set up a model with a three-way contrast: between linguistic,
non-linguistic, and control. I'd like to compare all three to each other,
but for the moment, I'll focus on one set of Helmert contrasts:

  [,1] [,2]
l   -1   -1
n    1   -1
c    0    2

This should compare 1) linguistic to non-linguistic, and 2) 2 times control
to the average of linguistic and non-linguistic. So far so good.

My "baseline" model includes some control predictors, and more importantly,
random intercepts and slopes for subject and item. The slopes are by
"condition," and in this case I've used the contrasts, as follows:

baseline&amp;lt;-lmer(logRT~(1+contr1+contr2|subject)+(1+contr1+contr2|item)+controlpredictors)

This, in order to specify the "maximal" structure allowed/justified by the
data. So far still so good (right?).

Then I add the fixed effect contrasts:

ofInterest&amp;lt;-update(baseline,.~.+contr1+contr2)

The fixed effect output of this type of model indicates that, for one set
of RT's, the t-value f&lt;/pre&gt;</description>
    <dc:creator>Jason Kahn</dc:creator>
    <dc:date>2011-11-10T21:49:07</dc:date>
  </item>
  <item rdf:about="http://comments.gmane.org/gmane.comp.lang.r.linguistics/485">
    <title>pvals.fnc and F statistic</title>
    <link>http://comments.gmane.org/gmane.comp.lang.r.linguistics/485</link>
    <description>&lt;pre&gt;Hello all,

I wonder if anyone can help me with a question about pvals.fnc and lmer.

I'm running some re-analyses on a dataset analysed some years ago (2007), working with the languageR package.

In the earlier analyses, to obtain values for reporting, we used the then-current version of pvals.fnc, with the following code:

x=pvals.fnc(model.lmer)
x$summary
x$anova

x$summary gave outputs like:

#                  Estimate Std.Error  DF t.value   pvals ci950 ci990 ci999
#(Intercept)        42.4514    2.1493 973  19.751 0.00000  TRUE  TRUE  TRUE
#typePsPr          -21.2240    1.7903 973 -11.855 0.00000  TRUE  TRUE  TRUE
#stressPN           -5.0640    0.8967 973  -5.648 0.00000  TRUE  TRUE  TRUE
#MorDMis             2.5627    1.7903 973   1.431 0.15275 FALSE FALSE FALSE
#typePsPr:stressPN   5.8140    1.0298 973   5.646 0.00000  TRUE  TRUE  TRUE
#stressPN:MorDMis    1.6569    1.0296 973   1.609 0.10794 FALSE FALSE FALSE

while x$anova gave e.g.

#            Df  SumSq MeanSq Denom         F   pvals
#type      &lt;/pre&gt;</description>
    <dc:creator>Rachel Smith</dc:creator>
    <dc:date>2011-11-09T00:34:33</dc:date>
  </item>
  <item rdf:about="http://comments.gmane.org/gmane.comp.lang.r.linguistics/475">
    <title>Main effects of categorical predictors in lmer</title>
    <link>http://comments.gmane.org/gmane.comp.lang.r.linguistics/475</link>
    <description>&lt;pre&gt;Dear R users,

I’m using mixed effects models (lmer) to predict a binary dependent variable
as a function of (1) a categorical predictor (A) with two levels (A1 and A2),
(2) another categorical predictor (B) with three levels (B1, B2 and B3), and
(3) the interaction between these two predictors. I have tried two models but
they return different results and I’m not sure which one is correct. I’m
interested in the main effect of B and the interaction between A and B
(because A alone has a significant effect in both models). My problem is
that there seem to be two sensible ways of examining the main effect of B:
(1) Helmert coding and (2) centering. But these two methods produce opposite
results! I don’t know which one I should use. Here are the two models with
some details and their outputs:


Model 1: ‘A’ is centered. ‘B’ is Helmert coded (‘B1’ (baseline) = 2, ‘B2’ = -1,
‘B3’ = -1) so that I can get a main effect of B by checking to see whether
baseline condition in B differs from the mean of B1&lt;/pre&gt;</description>
    <dc:creator>hossein karimi</dc:creator>
    <dc:date>2011-10-10T14:05:10</dc:date>
  </item>
  <item rdf:about="http://comments.gmane.org/gmane.comp.lang.r.linguistics/472">
    <title>Advice needed on reading large text file in R</title>
    <link>http://comments.gmane.org/gmane.comp.lang.r.linguistics/472</link>
    <description>&lt;pre&gt;I have a text file encoded in CSV format that I plan to run a series of
linguistic analyses on using R. My first thought is to use the read.csv
function to get the data into R. But I suspect that this might be naive, as the
file itself contains over 250,000 records and occupies over 1.5 GB of
disk space in its raw form. Each record has a large text field that extends
over multiple lines of the file; in some records this is equivalent to a few
paragraphs, in others the text would run to pages. There is a second multiline
text field but this is much shorter --- maybe 20 different phrases. There
appear to be escaped quotes throughout, though with a file this size verifying
this is difficult and some text editors crash trying to read it all in. I am
therefore dubious that read.csv is the right mechanism and am seeking a better
method.

I am open to suggestions for input method and even segmentation if
necessary, provided segmentation does not prevent analysis of the entire
data. Would access to the records by R be any qu&lt;/pre&gt;</description>
    <dc:creator>Trevor Jenkins</dc:creator>
    <dc:date>2011-09-08T12:34:53</dc:date>
  </item>
  <item rdf:about="http://comments.gmane.org/gmane.comp.lang.r.linguistics/467">
    <title>contrast coding and lrm</title>
    <link>http://comments.gmane.org/gmane.comp.lang.r.linguistics/467</link>
    <description>&lt;pre&gt;Dear R users,

We created a logistic regression model to investigate the influence of different factors on Dutch word order variation. To avoid collinearity and to increase interpretability of the effects, we decided to center the predictors. For one predictor with two levels and one ordinal (5 point scale) predictor, we did this by subtracting the mean; for a third predictor with three levels, we used contrast coding. The levels of the latter predictor (called PP_TYPE_3) were coded as follows:

     [,1] [,2]
abs     1    0
loc     0    1
temp   -1   -1

The model summary gives us the following:

                  Coef     S.E.    Wald Z P     
Intercept         -0.26760 0.10303 -2.60  0.0094
cDEF3S            -0.27503 0.06216 -4.42  0.0000
cANIM2_S          -0.17869 0.18324 -0.98  0.3295
PP_TYPE_3=loc      0.60699 0.13471  4.51  0.0000
PP_TYPE_3=temp     0.08269 0.12773  0.65  0.5174
cDEF3S * cANIM2_S -0.38080 0.12298 -3.10  0.0020

The names suggest that the model gives the estimates for the levels 'loc' &lt;/pre&gt;</description>
    <dc:creator>J. Vogels</dc:creator>
    <dc:date>2011-08-30T12:57:21</dc:date>
  </item>
  <item rdf:about="http://comments.gmane.org/gmane.comp.lang.r.linguistics/463">
    <title>Question about power</title>
    <link>http://comments.gmane.org/gmane.comp.lang.r.linguistics/463</link>
    <description>&lt;pre&gt;Hi,

I have a question about making inferences when power might be an issue.  I'm examining whether a variable has a significant effect in different parts of the syllable.  To do this, I have 2 different data sets, Onset and Coda, which I'm using to determine if the variable has effects in the syllable onset and coda, respectively.  The variable is significant (very small p-value) in the onset but is marginally significant in the coda (p= .055 in the full model, and model comparison with a baseline model that does not contain this variable gives a p-value of .07).  

While it's always difficult to know how to interpret a marginally significant effect, one issue that complicates the matter is that the Coda dataset has fewer items and trials than the Onset dataset.  One thing that I'd like to do is determine whether the marginal effect could simply be due to a lack of power.  My idea was to take a random sample of the Onset dataset so that it matches the size of the coda dataset and see if the variable of inte&lt;/pre&gt;</description>
    <dc:creator>Ariel M. Goldberg</dc:creator>
    <dc:date>2011-08-03T21:26:13</dc:date>
  </item>
  <item rdf:about="http://comments.gmane.org/gmane.comp.lang.r.linguistics/456">
    <title>Positive and negative logLik and BIC in model comparison (lmer)</title>
    <link>http://comments.gmane.org/gmane.comp.lang.r.linguistics/456</link>
    <description>&lt;pre&gt;Dear list,

 I have had a problem with model comparison for several months, so now
 I finally worked up my courage to ask for your help and hope that you
 can settle the question.

 I have frequently encountered positive logLik values and now heard
 that this might be due to a bug in the lmer function. However, I also
 recently found Douglas Bates stating that "a positive log-likelihood
 is acceptable in a model for a continuous response" in an S-list.
 Positive logLiks appear in Baayen's 2008 introductory book, always
 together with negative AIC and BIC. He does not seem to treat them as
 erroneous. Instead, if I understood correctly, he chooses the model
 with more negative AIC/BIC (smaller value) and more positive logLik
 (larger value) as the better model in these comparisons.
 So did I get it right and is this the way to go or is there a bug that
 inverts the polarity of the numbers?

 As a second question: Is there a general rule of thumb for cases when
 AIC and BIC point in different directions? Does it&lt;/pre&gt;</description>
    <dc:creator>Anja Arnhold</dc:creator>
    <dc:date>2011-08-01T08:29:40</dc:date>
  </item>
  <item rdf:about="http://comments.gmane.org/gmane.comp.lang.r.linguistics/439">
    <title>p-values from pvals.fnc</title>
    <link>http://comments.gmane.org/gmane.comp.lang.r.linguistics/439</link>
    <description>&lt;pre&gt;Dear R-users,

I have been wondering about something with the pvals.fnc function. As we
know, the pvals function gives two p-values, one based on the posterior
distribution (pMCMC) and one based on the t-distribution. In my experience
most of the time the two values are very similar. However, I have recently
come across situations where they are wildly different. I have been
particularly surprised to see t-values above 2 that have associated pMCMC
values that are not even close to significance, while at the same time the
t-distribution based p-value is significant. For example, a recent model I
worked with looked something like this:

model1 = lmer(RT~x*y+(1+x|Subject)+(1|Item))

and gave me a t-value of 2.07 for the interaction, with a pMCMC p-value of
0.4756 and a t-distribution p-value of 0.0381. Obviously I like one of these
better than the other! I know that the latter p-value is anticonservative,
but the magnitude of the discrepancy is nonetheless surprising to me, given
the t-value. I'd be very gratefu&lt;/pre&gt;</description>
    <dc:creator>Jakke Tamminen</dc:creator>
    <dc:date>2011-07-29T19:58:25</dc:date>
  </item>
  <item rdf:about="http://comments.gmane.org/gmane.comp.lang.r.linguistics/437">
    <title>Mixed model for eyetracking data analysis</title>
    <link>http://comments.gmane.org/gmane.comp.lang.r.linguistics/437</link>
    <description>&lt;pre&gt;Dear ling-r-lang users,

I'm writing to get some advice on the use of a logit mixed model for
eyetracking data analysis. My experiment has three IVs with two levels
each: language (English vs. Korean), stress pattern (trochaic vs. iambic),
and phonation type (aspirated vs. lax), plus one continuous IV, time. The
DV is binary, either 0 or 1. What I want to test is where in the time
course of word recognition the trochaic and iambic words differ in
their activation of the target words, and how they interact with
the phonation types in the two language groups.

For this, I first tried a logit mixed-effects model as below.

lmer(gaze~stress*lg*phonation*time+(1|subj)+(1|item), data,
family="binomial")

The issue that I had with this model is that it doesn't show interactions
between specific levels of factors. For example, I couldn't test whether
English speakers' behavior for aspirated trochaic words (default level) is
different from that for aspirated iambic ones.

So I have made a dummy co&lt;/pre&gt;</description>
    <dc:creator>Jeonghwa Shin</dc:creator>
    <dc:date>2011-07-28T17:07:56</dc:date>
  </item>
  <item rdf:about="http://comments.gmane.org/gmane.comp.lang.r.linguistics/425">
    <title>Embedding phonetic symbols in R</title>
    <link>http://comments.gmane.org/gmane.comp.lang.r.linguistics/425</link>
    <description>&lt;pre&gt;Hi all,

I'm sure I'm not the only Windows user out there trying to embed IPA
symbols in R graphics and then make a pdf out of it. If anyone knows
how to do that, I would appreciate some help. Because I have not
succeeded.

Here is what I have tried. I'm creating a plot in R with IPA phonetic
symbols, using the standard
font Doulos SIL
(http://scripts.sil.org/cms/scripts/page.php?item_id=DoulosSILfont&amp;amp;_sc=1)

When attempting to create a pdf of this plot in R using the pdf()
function, I will get the following error messages:

Error in text.default(my.x, my.y,  :
 Invalid font type
In addition: Warning messages:
1: In text.default(my.x, my.y,  :
 font family not found in PostScript font database

I've been advised that I can resolve this problem using the Cairo
package for R. But I have not succeeded. Here is my call:


It produces a pdf file, but all the IPA symbols come out as boxes.

Please note that R has no difficulties producing this plot in its
graphic window with IPA symbols. It's only the pdf output I&lt;/pre&gt;</description>
    <dc:creator>Sverre Stausland</dc:creator>
    <dc:date>2011-07-20T19:33:49</dc:date>
  </item>
  <item rdf:about="http://comments.gmane.org/gmane.comp.lang.r.linguistics/412">
    <title>lmer: Significant fixed effect only when random slope is included</title>
    <link>http://comments.gmane.org/gmane.comp.lang.r.linguistics/412</link>
    <description>&lt;pre&gt;Dear R users,

I have a logit mixed model with two categorical predictors (two types of salience measures) and a categorical dependent variable (pronoun used Y/N). One predictor has 2 levels, and the other has 3. I centered the 2-level predictor, and transformed the 3-level predictor into two binary predictors using contrast (sum) coding. I determined the random-effects structure by starting from a full model, and eliminating step by step all terms without a significant contribution to the model.

In the final model, I end up with random intercepts for subjects and items, and a by-subject random slope for my 2-level predictor. In this model, I get significant interactions between the fixed factors, which I had not expected to be significant by just looking at the data. Removing the random slope from the model completely eliminates these interactions, but model comparison suggests the random slope should be included. I have attached the two model summaries below.

Now my question is: is it normal to find such&lt;/pre&gt;</description>
    <dc:creator>Jorrig Vogels</dc:creator>
    <dc:date>2011-05-11T09:50:22</dc:date>
  </item>
  <item rdf:about="http://comments.gmane.org/gmane.comp.lang.r.linguistics/411">
    <title>Random effect modelling and Zipf distributed corpus data</title>
    <link>http://comments.gmane.org/gmane.comp.lang.r.linguistics/411</link>
    <description>&lt;pre&gt;Dear R-lang-ers,


I am currently trying to do some modelling on corpus extracted data with my students using a random intercept model.

Our model tries to predict the position of French attributive adjectives with respect to their head noun given several variables.
Our setting is a logistic regression with a random effect (random intercept) for adjective lemma.

Using lmer(...), we find that:
(1) the distribution of the conditional modes is anything but normal;
(2) lmer does not converge properly (the deviance function gets flat as the algorithm progresses towards the solution, so lmer iterates far too many times and yields an overfitted model).

We tried to solve issue (2): we were able to get better convergence with the lme4a library.

However, problem (1) remains: since the word lemmas are Zipf-distributed, most random intercepts have estimates close to 0 (the grand mean).
These random intercepts are also mostly those for the words with low frequency in the data (hapax&lt;/pre&gt;</description>
    <dc:creator>Benoit Crabbé</dc:creator>
    <dc:date>2011-03-03T13:23:56</dc:date>
  </item>
  <item rdf:about="http://comments.gmane.org/gmane.comp.lang.r.linguistics/410">
    <title>ANOVA type main effects</title>
    <link>http://comments.gmane.org/gmane.comp.lang.r.linguistics/410</link>
    <description>&lt;pre&gt;Hi dear r-lang users,

I have a question about ANOVA type of main effects. From what I've read, one
way to obtain the type of main effect one would get with ANOVA is to do
model comparisons, like below:

mod1 &amp;lt;- lmer(dv ~ iv1 + iv2 + (1|subject), data)
mod2 &amp;lt;- lmer(dv ~ iv1 + (1|subject), data)
anova(mod1, mod2)

A significant difference indicates that the factor iv2 makes a significant
contribution to the model.

However, I wonder if there are other ways to obtain the same information.
Specifically, I have a 2x2 design, and I wonder if, after I center the two
2-level factors, the results would be equivalent to ANOVA type of main
effects as opposed to simple effects. Thank you in advance!


Xiao
&lt;/pre&gt;</description>
    <dc:creator>Xiao He</dc:creator>
    <dc:date>2011-02-23T19:42:43</dc:date>
  </item>
  <textinput rdf:about="http://search.gmane.org/?group=$group=gmane.comp.lang.r.linguistics">
    <title>Search Engine</title>
    <description>Search the mailing list at Gmane</description>
    <name>query</name>
    <link>http://search.gmane.org/?group=$group=gmane.comp.lang.r.linguistics</link>
  </textinput>
</rdf:RDF>

