gmane.comp.lang.r.general
http://blog.gmane.org/gmane.comp.lang.r.general
apply if else statement to vector
http://comments.gmane.org/gmane.comp.lang.r.general/313897
<pre>Subscribers,
What is the correct syntax to apply the 'if else' conditional statement
to vector objects?
Example:
vectorx<-c(50,50,20,70)
vectory<-c(50,50,20,20)
vectorz<-function () {
if (vectorx>vectory)
vectorx
else vectorx<-0
}
vectorz()
Warning message:
In if (vectorx > vectory) vectorx else vectorx <- 0 :
the condition has length > 1 and only the first element will be used
The help manual (?'if') explains that only a length-one condition is acceptable; what
is an appropriate alternative function to use please?
The desired result for vectorz is:
0 0 0 70
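One likely answer is the vectorized `ifelse()`, which tests the condition element by element instead of requiring a single TRUE/FALSE. A minimal sketch using the vectors from the example above (assigning the result to `vectorz` directly, rather than wrapping it in a function, is a choice made here for brevity):

```r
vectorx <- c(50, 50, 20, 70)
vectory <- c(50, 50, 20, 20)

# Keep the element of vectorx where it exceeds vectory, otherwise use 0.
vectorz <- ifelse(vectorx > vectory, vectorx, 0)
print(vectorz)
```

With these inputs only the last element of vectorx is strictly greater than its counterpart in vectory, so only that element is kept.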
</pre>
rl< at >openmailbox.org 2014-10-02T09:01:57
pmnorm to find cdf with linear decision bound
http://comments.gmane.org/gmane.comp.lang.r.general/313894
<pre>Hi all,
I need help calculating the error rate of an optimal responder to a
multidimensional discrimination task. I have been trying to use pmnorm
to do this, but am not sure about its functionality. I'm working with a
dataset of responses (binary, attack/not attack) of subjects who were
asked to discriminate between two different, overlapping categories of
stimuli. The stimuli subjects viewed were bivariate normal in their
distributions (squares with different mean blue:yellow ratios and
sizes). It is easy to find a threshold (a line) on the bivariate plane
that describes the optimal discrimination function. It is a diagonal. To
find the error rate committed by this optimal responder, I need to find
the cumulative distribution function for a bivariate normal that lies
above the line. However, pmnorm only seems to calculate rectangular
probabilities, i.e. only uses limits of integration perpendicular to the
axes. Is there another function I could use?
Thanks,
David
The problem is visualized w
</pre>
David Kikuchi 2014-10-02T01:05:12
Best Beginner Books?
http://comments.gmane.org/gmane.comp.lang.r.general/313890
<pre>Hey Folks,
I’m hoping to get a general consensus on a good book for someone with no prior experience in R who is new to data science and statistical analysis. So far, I’ve been recommended “Software For Data Analysis: Programming With R (Statistics And Computing)” by John Chambers. I’ve seen some other books mentioned here and there in the mailings, but I can’t recall their names. Does anyone have any thoughts on this book, or others?
Best Regards,
Jason Eyerly
</pre>
Jason Eyerly 2014-10-02T00:48:25
optimization question
http://comments.gmane.org/gmane.comp.lang.r.general/313889
<pre>Dear All,
please provide help with the following:
we have
a <-c(0,1,1,0,1,0,0,0,0)
b <-c(0,0,0,1,0,0,0,0,0)
c <-c(1,0,1,0,1,1,0,0,0)
d <-c(0,1,0,1,0,1,0,0,0)
df <-rbind(a,b,c,d)
df <-cbind(df,h=c(sum(a)*8,sum(b)*8,sum(c)*8,sum(d)*8))
df <-cbind(df,df[,10]*c(1,2,3,2))
I would like to minimize the value for sum(df[,11]) under the following conditions:
1. all values of a,b,c, and d are binary variables and are the variables to change to get the optimal result
2. sum(a), sum(b), and sum(d) should be each 5 or more
3. sum(c) should be 3 or less
4. a[2], a[3], b[2], d[7] and d[8] are fixed to their current values.
any thoughts or reference examples you could help with are greatly appreciated
thanks
andras
</pre>
Andras Farkas 2014-10-02T00:30:26
How to check to see if a variable is within a range of another variable
http://comments.gmane.org/gmane.comp.lang.r.general/313879
<pre>Is there an easy way to check whether a variable is within a +/- 10%
range of another variable in R?
Say I have a variable 'A': I want to check whether it is within +/- 10% of
variable 'B', and create another variable 'C' that says whether it
is or not.
Is there a function that is able to do that?
eventual outcome:
A B C
67 76 no
24 23 yes
40 45 yes
10 12 yes
70 72 yes
101 90 no
9 12 no
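A vectorized comparison may be enough here, with no dedicated function needed. The sketch below assumes that "within +/- 10%" means |A - B| <= 0.1 * |B|, i.e. the tolerance is taken relative to B. That reading is an assumption, and under it some rows come out differently from the illustrative table above (for example 40 vs 45, where the difference of 5 exceeds 4.5). The data frame name `dat` is introduced only for the sketch.

```r
dat <- data.frame(
  A = c(67, 24, 40, 10, 70, 101, 9),
  B = c(76, 23, 45, 12, 72, 90, 12)
)

# TRUE where A lies within 10% of B (tolerance relative to B; an assumption).
within_10pct <- abs(dat$A - dat$B) <= 0.10 * abs(dat$B)

# Store the result as a yes/no column.
dat$C <- ifelse(within_10pct, "yes", "no")
print(dat)
```

If the tolerance should instead be relative to A, or to the larger of the two values, only the right-hand side of the comparison changes.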
</pre>
Kate Ignatius 2014-10-01T22:11:05
ave documentation
http://comments.gmane.org/gmane.comp.lang.r.general/313877
<pre>Hi,
I was looking at the ave documentation and it seems awfully sparse. How about we change the description from:
Subsets of x[] are averaged, where each subset consist of those observations with the same factor levels.
to:
Subsets of x[] are averaged, where each subset consists of those observations within a combination of factor levels. The result will have the same length as x[].
Maybe that's not just what I want either. ave is so useful any time you want to basically split/apply and want the output to be the same length. Here's an example...
x <- c('a', 'a', 'b', 'b', 'c', 'c', 'a', 'a')
f <- c(1, 1, 1, 1, 2, 2, 2, 2)
ave(x, f, FUN = order )
One would never have deduced that from the help alone and it seems like a primary function. Or, more advanced
as.numeric( ave(x, f, FUN = function(x) factor(x, levels = unique(x))) )
And if this isn't where I'd send help documentation suggestions, where could I?
</pre>
John Christie 2014-10-01T17:56:15
power.t.test threading on 'power'
http://comments.gmane.org/gmane.comp.lang.r.general/313875
<pre>Simple question. A vector of ‘number of observations’ can be input to power.t.test, and a vector of powers is output. But inputting a vector of powers generates an error. Am I missing something?
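A possible workaround, sketched on the assumption that the solver behind `power.t.test` handles only one target power per call, is to loop over the power values with `sapply()` and extract `n` from each result. The parameter values are the ones used in this post; `powers` and `n_needed` are names chosen for the sketch.

```r
powers <- c(0.7, 0.8, 0.9)

# Solve for n once per target power and collect the (fractional) sample sizes.
n_needed <- sapply(powers, function(p) {
  power.t.test(power = p, delta = 2, sd = 3, sig.level = 0.05,
               type = "two.sample", alternative = "one.sided")$n
})
print(n_needed)

# Round up to get whole observations per group.
print(ceiling(n_needed))
```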
Vector of n values
power.t.test(n=c(28,29,30), delta=2, sd=3, sig.level=0.05, type="two.sample", alternative="one.sided")
Two-sample t test power calculation
n = 28, 29, 30
delta = 2
sd = 3
sig.level = 0.05
power = 0.7933594, 0.8058963, 0.8177506
alternative = one.sided
NOTE: n is number in *each* group
Vector of power values
power.t.test(power=c(0.7,0.8,0.9), delta=2, sd=3, sig.level=0.05, type="two.sample", alternative="one.sided")
Error in uniroot(function(n) eval(p.body) - power, c(2, 1e+07)) :
f() values at end points not of opposite sign
In addition: Warning messages:
1: In if (is.na(f.lower)) stop("f.lower = f(lower) is NA") :
the condition has length > 1 and only the first element will be used
2: In if (is.na(f.upper
</pre>
Stephen Kennedy 2014-10-01T12:29:16
optimize
http://comments.gmane.org/gmane.comp.lang.r.general/313874
<pre>Page 53 of Robert and Casella's "Use R" book, Introduction to Monte Carlo
Methods with R, has the following code:
optimize(f=function(x){dbeta(x,2.7,6.3)},
+ interval=c(0,1) ,max=T)$objective
This should return
[1] 2.669744
I run R from R Studio. When I enter this code I receive the following error
message: Error in f(arg, ...) : unused argument (max = TRUE)
Can someone help me understand why I get this error?
Thanks.
NickG
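A likely cause: in current versions of `optimize()` the `maximum` argument comes after `...` in the argument list, so the abbreviation `max` is not partially matched and is passed on to `f` instead. Spelling the argument out should avoid the error. A minimal sketch (`res` is a name introduced here):

```r
# Use the full argument name; abbreviations are not matched after `...`.
res <- optimize(f = function(x) dbeta(x, 2.7, 6.3),
                interval = c(0, 1), maximum = TRUE)

print(res$maximum)    # location of the maximum
print(res$objective)  # density value at the maximum
```

The location can be checked against the analytical mode of a Beta(2.7, 6.3) density, (2.7 - 1) / (2.7 + 6.3 - 2).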
</pre>
Nick gayeski 2014-10-01T18:20:35
Opening netCDF file with large number of variables but size is small in R
http://comments.gmane.org/gmane.comp.lang.r.general/313865
<pre>Dear all,
I am trying to open a netCDF file of size 1.2 MB that contains more than
3000 variables, using the 'netcdf' package.
The problem is that it takes more than 10 minutes to open this small
nc file.
Is there any way to make it faster?
I have attached one sample nc-file along with this mail.
Thank you all in advance.
</pre>
കുഞ്ഞായി kunjaai 2014-10-01T11:11:07
Print list to text file with list elements names
http://comments.gmane.org/gmane.comp.lang.r.general/313864
<pre>Hi everyone,
I want to write a list of data frames to a text file and preserve the names given to the list elements (I am using R 3.1.0).
I tried:
setNames(myList, myNames) # myNames is a vector of char elements same length as myList
sink(sprintf("%s",filename))
lapply(myList,print)
sink()
And here I have two problems:
1. R writes each element of my list to the text file twice, so for example if I have a list with 2 elements (i.e. data frames) in it, it will write 4: in order data frame 1, data frame 2, data frame 1, data frame 2.
2. The names of list elements do not print to the text file.
Any suggestions to solve these issues would be appreciated!
Many thanks,
Ingrid
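Two things probably explain the symptoms. `setNames()` returns a new list, so its result has to be assigned. And a top-level `lapply(myList, print)` prints each element inside the loop, after which the list returned by `lapply()` is auto-printed as well. A sketch of one way around both, using hypothetical example data (`myList`, `myNames`, and the temporary `filename` are stand-ins):

```r
# Hypothetical stand-in data: a list of two small data frames.
myList <- list(data.frame(x = 1:2), data.frame(y = c("a", "b")))
myNames <- c("first", "second")

# Assign the result; setNames() does not modify myList in place.
myList <- setNames(myList, myNames)

filename <- tempfile(fileext = ".txt")
sink(filename)
for (nm in names(myList)) {
  cat(nm, "\n", sep = "")  # element name as a header line
  print(myList[[nm]])      # the data frame itself, printed once
  cat("\n")
}
sink()

# Read the file back to inspect what was written.
out <- readLines(filename)
print(out)
```

A `for` loop returns nothing visible, so nothing is printed a second time; wrapping the original `lapply()` call in `invisible()` would be another option.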
</pre>
Ingrid Charvet 2014-10-01T10:59:57
aov and groups coding
http://comments.gmane.org/gmane.comp.lang.r.general/313863
<pre>please consider the following example:
#start code
set.seed(123)
level<-rnorm(18, 10,3)
group1<-rep(letters[1:3], each=6)
summary(aov(level~group1))
group2<-rep(1:3,each=6)
str(group2)
summary(aov(level~group2))
#same result as for group1
summary(aov(level~factor(group2)))
#same result as for aov
anova(lm(level~group2))
#end code
what I would like to do is to perform an anova among groups (analysis of
variance for three different groups);
consider that groups are completely arbitrary: they are not intended to
have any sort of scaling or ordinal meaning;
in my example the same groups are coded in two alternative ways: group1 as
"chr" (factor) and group2 as "num"; so, keeping in mind my purpose (is
there any difference in the level among groups?), I would simply consider
the result of aov() for group2 (num) as nonsense (with respect to my
specific purpose)
is that a correct interpretation?
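The difference shows up in the degrees of freedom: with a numeric `group2`, `aov()` fits a single slope (1 df), whereas `factor(group2)` gives one parameter per group beyond the first (2 df for three groups). A short check, reusing the simulated data from the code above (`fit_num`, `fit_fac`, `df_num`, and `df_fac` are names introduced here):

```r
set.seed(123)
level <- rnorm(18, 10, 3)
group2 <- rep(1:3, each = 6)

# Numeric coding: group2 is treated as a continuous predictor (regression).
fit_num <- aov(level ~ group2)

# Factor coding: group2 is treated as three unordered groups (one-way anova).
fit_fac <- aov(level ~ factor(group2))

# Degrees of freedom for the term and the residuals in each fit.
df_num <- summary(fit_num)[[1]][["Df"]]
df_fac <- summary(fit_fac)[[1]][["Df"]]
print(df_num)
print(df_fac)
```

For arbitrary, unordered groups the factor version is the one that answers the question about differences among groups.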
I hope I have not misinterpreted the indications of the following thread
http://r.789695.n4.nabble.com/Que
</pre>
Massimo Bressan 2014-10-01T11:00:34
uninstalled and reinstalled R, now cannot install packages
http://comments.gmane.org/gmane.comp.lang.r.general/313859
<pre> Hello,
I have a student (running Windows 8) who ran into a problem installing the
RcmdrPlugin.IPSUR package and, despite instructions to the contrary,
uninstalled R and reinstalled. He uninstalled it using the Windows 8
utility. Now he cannot install any packages.
Here are the error messages he gets when he tries to use the
menu to install a package:
--- Please select a CRAN mirror for use in this session ---
Warning: unable to access index for repository
http://cran.mirrors.hoobly.com/bin/windows/contrib/3.1
Warning: unable to access index for repository
http://www.stats.ox.ac.uk/pub/RWin/bin/windows/contrib/3.1
Error in install.packages(NULL, .libPaths()[1L], dependencies = NA, type =
type) :
no packages were specified
In addition: Warning message:
In open.connection(con, "r") :
unable to connect to 'cran.r-project.org' on port 80.
Here are the messages he gets when using the install.packages command (it
doesn't matter what package he tries to install):
Installing package into
‘C:/Us
</pre>
Sarah Hardy 2014-09-30T23:00:56
Using R for modelling Australian Senate Election in NSW?
http://comments.gmane.org/gmane.comp.lang.r.general/313858
<pre>People,
I am setting up an Australian Science and Technology party with Human
Health and Longevity as its defining, first platform plank:
http://lestp.org
and intend to run Senate candidates at the next Federal Election (due
late 2016). Senators are elected from each State and Territory using a
Proportional Representation (PR) method but it is complex and odd things
happen with direction of preferences when the smallest groups or
individuals are progressively eliminated from the count when producing
the final quotas. I have a little experience with R but it seems like
it might be useful for modelling this situation? A quota for a
half-senate election (six senators to be elected in each state) is one
seventh of the total number of votes + one vote.
My current thoughts are these:
- each Senate vote corresponds to a vector with say 80 numbers on it
(corresponding to 80 candidates - some grouped into parties, some as
individuals)
- the order of the numbers from 1-80 could be random on the
</pre>
Philip Rhoades 2014-10-01T00:32:57
Reading text file with fortran format
http://comments.gmane.org/gmane.comp.lang.r.general/313852
<pre>Hello
I read data with fortran format:
mydata<-read.fortran('foo.txt',
c("4F10.4","F8.3","3F3.0","20F2.0"))
colnames(mydata)<-c("q1","q2","q3","q4","income","hhsize",
"weekend","dietk","quart1","quart2","quart3","male","age35",
"age50","age65","midwest","south","west","nonmetro",
"suburb","black","asian","other","hispan","hhtype1",
"hhtype2","hhtype3","emp_stat")
dstat(mydata,digits=6)
I produced the following sample statistics for the first 4
variables (q1,q2,q3,q4):
Mean Std.dev Min Max Obs
q1 0.000923 0.002509 0 0.035245 5649
q2 0.000698 0.001681 0 0.038330 5649
q3 0.000766 0.002138 0 0.040100 5649
q4 0.000373 0.001140 0 0.026374 5649
The correct sample statistics are:
Variable| Mean Std.Dev. Minimum Maximum
--------+----------------------------------------------------
Q1| 9.227632 25.09311 0.0 352.4508
Q2| 6.983078 16.80984 0.0 383.2995
</pre>
Steven Yen 2014-09-30T21:04:05
Best Distribution
http://comments.gmane.org/gmane.comp.lang.r.general/313850
<pre>Dear useRs,
I have the following data
c(42.2, 45.2, 46, 48, 54, 54.1, 59.4, 61, 62.2, 63.5, 65.024, 71.9, 73.4, 76.6, 76.708, 77.5, 77.724, 78, 81.3, 84.7, 84.836, 85.09, 88.2, 91.4, 94, 95.8, 96, 97.3, 101, 101, 101.5, 102.3, 102.87, 108.7, 109.5, 110.5, 110.7, 112, 114.3, 118.11, 121.412, 128.1, 131, 140, 142, 143.3, 151.4, 153.7, 189.4, 214.3)
I want to fit Gumbel and log-normal distributions to it and plot them in the same window to see which distribution fits best.
Thank you very much in advance,
Eliza
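One way this could be approached with base R only is sketched below. It assumes maximum-likelihood fitting, AIC as the comparison criterion, and "same window" meaning both fitted densities drawn over one histogram; all of these are assumptions rather than the only reasonable choices. The Gumbel density is written out by hand because base R does not provide one, and `obs`, `dgumbel`, `nll_gumbel`, and the other names are introduced for the sketch.

```r
obs <- c(42.2, 45.2, 46, 48, 54, 54.1, 59.4, 61, 62.2, 63.5, 65.024, 71.9,
         73.4, 76.6, 76.708, 77.5, 77.724, 78, 81.3, 84.7, 84.836, 85.09,
         88.2, 91.4, 94, 95.8, 96, 97.3, 101, 101, 101.5, 102.3, 102.87,
         108.7, 109.5, 110.5, 110.7, 112, 114.3, 118.11, 121.412, 128.1,
         131, 140, 142, 143.3, 151.4, 153.7, 189.4, 214.3)

# Log-normal: closed-form maximum-likelihood estimates on the log scale.
meanlog <- mean(log(obs))
sdlog <- sqrt(mean((log(obs) - meanlog)^2))

# Gumbel density, written out because base R has no built-in version.
dgumbel <- function(x, loc, scale) {
  z <- (x - loc) / scale
  exp(-(z + exp(-z))) / scale
}

# Gumbel negative log-likelihood; the scale is optimized on the log scale
# so that it stays positive.
nll_gumbel <- function(par) {
  s <- exp(par[2])
  z <- (obs - par[1]) / s
  sum(z + exp(-z) + log(s))
}

# Method-of-moments starting values, then numerical minimization.
scale0 <- sd(obs) * sqrt(6) / pi
fit <- optim(c(mean(obs) - 0.5772 * scale0, log(scale0)), nll_gumbel)
loc_hat <- fit$par[1]
scale_hat <- exp(fit$par[2])

# AIC = 2 * (number of parameters) - 2 * log-likelihood; both have 2 parameters.
aic_lnorm <- 4 - 2 * sum(dlnorm(obs, meanlog, sdlog, log = TRUE))
aic_gumbel <- 4 + 2 * fit$value
print(c(lognormal = aic_lnorm, gumbel = aic_gumbel))

# Both fitted densities over one histogram.
hist(obs, freq = FALSE, main = "Log-normal vs Gumbel fit", xlab = "value")
curve(dlnorm(x, meanlog, sdlog), add = TRUE, lty = 1)
curve(dgumbel(x, loc_hat, scale_hat), add = TRUE, lty = 2)
legend("topright", legend = c("log-normal", "Gumbel"), lty = 1:2)
```

The distribution with the lower AIC is the better fit by this criterion; a Q-Q plot for each fit would be a useful visual complement.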
</pre>
eliza botto 2014-09-30T19:50:33
Converting factor data into Date-time format
http://comments.gmane.org/gmane.comp.lang.r.general/313848
<pre>Hello R help:
I am new to this forum, so I apologize in advance for any protocol missteps. I
have a data set comprising eight birds with GPS, each of which
transmits every day at 8:00 am, 4:00 pm, and midnight for 1 year (although I have
some missing relocations). I am trying to format my data to be run in
adehabitatLT, but I am unsuccessful. I have a "csv" file with
the following header: "Craneid, Date, Time, Long, Lat, Habitat,
BurstID". R creates factor levels in all of the data except Lat and
Long. I have attempted the following to correctly format my date and time
factors (data=l10):
First attempt:
1. datetime=as.POSIXct(paste(l10$Date, l10$Time), format="%m/%d/%Y %H:%M:%S", "America/Chicago")
2. coord=data.frame((l10$Longitude), (l10$Latitude))
3. test=as.ltraj(coord, datetime, l10$Craneid, burst=l10$ID, typeII=TRUE)
Results:
Error in as.ltraj(coord, datetime, l10$Craneid, burst = l10$ID, typeII = TRUE) :
non unique dates for a given burst
I researched this err
</pre>
tandi perkins 2014-09-30T17:54:35
Inverse Student t-value
http://comments.gmane.org/gmane.comp.lang.r.general/313836
<pre>Dear Sir/Madam,
I am trying to calculate the two-tailed inverse of the Student's
t-distribution, as provided by Excel functions like
=TINV(probability, deg_freedom).
For instance, the Excel function =TINV(0.0000408831,1221) returns
4.0891672.
Could you show me a manual calculation for this?
I appreciate your help in advance.
Cheers! Andre
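Excel documents `TINV(probability, deg_freedom)` as the two-tailed inverse of Student's t-distribution, so the R counterpart is presumably the upper `probability/2` quantile from `qt()`. A sketch with the inputs from this post (`tinv` is a hypothetical helper name, and no claim is made here that it reproduces the Excel figure quoted above):

```r
# Two-tailed inverse: the t value whose two-sided tail probability equals p.
tinv <- function(p, df) qt(p / 2, df, lower.tail = FALSE)

t_val <- tinv(0.0000408831, 1221)
print(t_val)

# Round trip: the two-sided tail probability of t_val should recover p.
p_back <- 2 * pt(t_val, 1221, lower.tail = FALSE)
print(p_back)
```

The round trip through `pt()` is a quick way to confirm that the quantile and the probability are consistent with each other.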
</pre>
Andre 2014-09-30T17:31:05
Course Halifax: Introduction to Linear mixed effects models, GLMM and MCMC with R
http://comments.gmane.org/gmane.comp.lang.r.general/313834
<pre>Apologies for cross-posting
We would like to announce the following statistics course:
Course: Introduction to Linear mixed effects models, GLMM and MCMC
with R
Location: Halifax, Canada
Date: 19 - 23 January 2015
Remaining seats: 10
Course website: http://www.highstat.com/statscourse.htm
Course flyer: http://www.highstat.com/Courses/Flyer2015_01Halifax.pdf
Kind regards,
Alain Zuur
</pre>
Highland Statistics Ltd 2014-09-30T16:39:25
Regarding R package spatstat
http://comments.gmane.org/gmane.comp.lang.r.general/313831
<pre>Hi ,
I am installing an R package as follows:
R CMD INSTALL spatstat_1.26-0.tar.gz
It seems my installation hangs at
"byte-compile and prepare package for lazy loading"
I waited for half an hour, but it remained hung at the stage below.
Please help.
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Target "all" is up to date.
installing to
/gpfs1/home/shivali/R-3.1.0/R-3.1.0/lib/R/library/spatstat/libs
** R
** data
*** moving datasets to lazyload DB
** demo
** inst
** byte-compile and prepare package for lazy loading
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
[ncmr0202][/gpfs1/home/shivali]> ps -ef | grep R-3.1.0
shivali 438938 446478 0 17:23:40 pts/9 0:00 grep R-3.1.0
shivali 635448 381678 0 17:12:03 pts/14 0:00 sh
/gpfs1/home/shivali/R-3.1.0/R-3.1.0/lib/R/bin/Rcmd INSTALL
spatstat_1.26-0.tar.gz
shivali 316220 635448 118 17:12:04 pts/14 10:58
/gpfs1/home/shivali/R-3.1.0/R-3.1.0/lib/R/bin/exec/R --args --args --args
nextArgspatstat_1.26-0.tar.gz
shivali 344832 1 118 16:23:05 pts/14 60:26
/gpfs1/home/shivali/R-3.
</pre>
shivali gangwar 2014-09-30T11:58:43
Using "survey" package with ACS PUMS
http://comments.gmane.org/gmane.comp.lang.r.general/313828
<pre>
I'm trying to reproduce some results from the American Community Survey
PUMS data using the "survey" package. I'm using the one-year 2012 estimates
for New Hampshire
(http://www2.census.gov/acs2012_1yr/pums/csv_pnh.zip) and comparing to the
estimates for user verification from
http://www.census.gov/acs/www/Downloads/data_documentation/pums/Estimates/pums_estimates_12.csv
Once the age groups are set up as specified in the verification estimates,
the following SAS code produces the correct estimated totals with standard
errors:
proc surveyfreq data = acs2012 varmethod = jackknife;
weight pwgtp;
repweights pwgtp1 -- pwgtp80 / jkcoefs = 0.05;
table SEX agegroup;
run;
I've not been successful in reproducing the standard errors with R,
although they are very close. My code follows; what revisions do I need to
make?
Thanks,
Mike L.
# load estimates for verification
pums_est <- read.csv("pums_estimates_12.csv")
pums_est[,4] <- as.integer(gsub(",", "", pums_est[,4]))
# load PUMS data
pums_p <- read.csv("
</pre>
Michael.Laviolette< at >dhhs.state.nh.us 2014-09-30T13:17:28
Loop does not work: Error in else statement (II)
http://comments.gmane.org/gmane.comp.lang.r.general/313825
<pre>
Hi to all members of R list,
I
</pre>
Frank S. 2014-09-30T12:54:53