gmane.comp.lang.r.general
http://blog.gmane.org/gmane.comp.lang.r.general
read multiple sheets of excel data into R
http://comments.gmane.org/gmane.comp.lang.r.general/330661
<pre>Hi all,
I tried to use the package "XLConnect" to read Excel data into R. I got
the following error message:
Error : .onLoad failed in loadNamespace() for 'rJava', details:
call: fun(libname, pkgname)
error: No CurrentVersion entry in Software/JavaSoft registry! Try
re-installing Java and make sure R and Java have matching
architectures.
I tried read.xls and got the following error message:
perl executable not found. Use perl= argument to specify the correct
path.Error in file.exists(tfn) : invalid 'file' argument
Can anyone give me some input on this?
Thanks.
Hanna
</pre>
li li (2016-05-28T17:55:50)

sandwich package: HAC estimators
http://comments.gmane.org/gmane.comp.lang.r.general/330659
<pre>Dear R users,
I am running a logistic regression using the rms package and the code looks as follows:
crisis_bubble4<-lrm(stock.market.crash~crash.MA+bubble.MA+MP.MA+UTS.MA+UPR.MA+PPI.MA+RV.MA,data=Data_logitregression_movingaverage)
Now I would like to calculate HAC robust standard errors with the sandwich package, using the Newey-West estimator, as follows:
coeftest(crisis_bubble4,df=Inf,vcov=NeweyWest)
Error in match.arg(type) :
'arg' should be one of "li.shepherd", "ordinary", "score", "score.binary", "pearson", "deviance", "pseudo.dep", "partial", "dfbeta", "dfbetas", "dffit", "dffits", "hat", "gof", "lp1"
As you can see, it doesn't work. Therefore, I did the same using glm() instead of lrm():
crisis_bubble4<-glm(stock.market.crash~crash.MA+bubble.MA+MP.MA+UTS.MA+UPR.MA+PPI.MA+RV.MA,family=binomial("logit"),data=Data_logitregression_movingaverage)
If I use the coeftest() function, I get the following results.
coeftest(crisis_bubble4,df=Inf,vcov=NeweyWest)
z test of coeff</pre>
T.Riedle (2016-05-28T17:01:50)

colored table
http://comments.gmane.org/gmane.comp.lang.r.general/330652
<pre>I want to print a table where table elements are colored according to the frequency of the bin. For example, consider below table.
Function values that I would like to print in the table
x.eq.minus1 x.eq.zero x.eq.plus1
y.eq.minus1 -20 10 -5
y.eq.zero -10 6 22
y.eq.plus1 -8 10 -14
Frequency table to color the above table
x.eq.minus1 x.eq.zero x.eq.plus1
y.eq.minus1 0.05 0.15 0.1
y.eq.zero 0.07 0.3 0.08
y.eq.plus1 0.05 0.15 0.05
In the resulting table, the element for (x = 0, y = 0) will be 6. This will be printed with a dark color background. The element for (x = -1, y = -1) will be -20. This will be printed with a light color background. And so on.
Thanks for your help,
Naresh
</pre>
Naresh Gurbuxani (2016-05-28T13:10:53)

code to provoke a crash running rterm.exe on windows
http://comments.gmane.org/gmane.comp.lang.r.general/330651
<pre>hi, here's a minimal reproducible example that crashes my R 3.3.0 console
on a powerful windows server. below the example, i've put the error (not
crash) that occurs on R 3.2.3.
should this be reported to http://bugs.r-project.org/ or am i doing
something silly? thanx
# C:\Users\AnthonyD>"c:\Program Files\R\R-3.3.0\bin\x64\Rterm.exe"
# R version 3.3.0 (2016-05-03) -- "Supposedly Educational"
# Copyright (C) 2016 The R Foundation for Statistical Computing
# Platform: x86_64-w64-mingw32/x64 (64-bit)
# R is free software and comes with ABSOLUTELY NO WARRANTY.
# You are welcome to redistribute it under certain conditions.
# Type 'license()' or 'licence()' for distribution details.
# Natural language support but running in an English locale
# R is a collaborative project with many contributors.
# Type 'contributors()' for more information and
# 'citation()' on how to cite R or R packages in publications.
# Type 'demo()' for some demos, 'help()' for on-line help, or
# 'help.start()' for an HTML brow</pre>
Anthony Damico (2016-05-28T09:14:24)

Dynamically populate a vector by iteratively applying a function to its previous element.
http://comments.gmane.org/gmane.comp.lang.r.general/330643
<pre>I want to dynamically populate a vector by iteratively applying a
function to its previous element, without using a 'for' cycle. My
solution, based on a question I posted some times ago for a more
complicated problem (see "updating elements of a list of matrixes
without 'for' cycles") was to define a matrix of indexes, and then
apply the function to the indexes. Here's a trivial example:
# my vector, all elements still unassigned
v <- rep(NA, 10)
# initialisation
v[1] <- 0
# the function to be applied
v.fun <- function(x) {
  i <- x[1]
  return(v[i] + 1)
}
# The matrix of array indices
idx <- as.matrix(expand.grid(c(1:9)))
# Application of the function
v[2:10] <- invisible(apply(idx, 1, v.fun))
[Note that this example is deliberately trivial: v <-c(0:9) would
solve the problem. In general, the function can be more complicated.]
The trick works only for v[2]. I imagine this is because the vector is
not dynamically updated during the iteration, so all values v[2:10]
are retained as NA.
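A possible loop-free alternative is Reduce() with accumulate = TRUE, which feeds each result into the next call; a minimal sketch (the step function here is a stand-in for the real update rule):

```r
# Reduce(accumulate = TRUE) threads each intermediate result into the
# next call, building the whole vector without a 'for' loop or any
# external state that would need to be updated mid-iteration.
step <- function(prev, i) prev + 1  # stand-in update rule; replace as needed
v <- Reduce(step, x = 2:10, init = 0, accumulate = TRUE)
v
# For this trivial rule, v is 0, 1, ..., 9
```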
How can I solve the </pre>
Matteo Richiardi (2016-05-28T00:31:41)

Application of "merge" and "within"
http://comments.gmane.org/gmane.comp.lang.r.general/330642
<pre>Dear Rxperts!
Is there a way to compute relative values using the within() function?
Any assistance/suggestions are highly welcome!!
Thanks again,
Santosh...
___________________________________________________________________
A sample dataset and the computation "outside" the within() function are shown below.
q <- data.frame(GL = rep(paste("G", 1:3, sep = ""), each = 50),
                G = rep(1:3, each = 50),
                D = rep(paste("D", 1:5, sep = ""), each = 30),
                a = rep(1:15, each = 10),
                t = rep(seq(10), 15),
                b = round(runif(150, 10, 20)))
r <- subset(q, !duplicated(paste(G, a)), select = c(G, a, b))
names(r)[3] <- "bl"
s <- merge(q, r)
s$db <- s$b - s$bl
G a GL D t b bl db
1 1 1 G1 D1 1 13 13 0
2 1 1 G1 D1 2 16 13 3
3 1 1 G1 D1 3 19 13 6
4 1 1 G1 D1 4 12 13 -1
5 1 1 G1 D1 5 19 13 6
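The same baseline subtraction can be done inside within() using ave(); a sketch on a small deterministic frame (made-up numbers, column names as in the post):

```r
# ave() computes a per-group summary aligned back to the rows, so the
# merge() against a de-duplicated baseline frame is not needed.
q <- data.frame(G = rep(1:2, each = 4),
                a = rep(1:2, each = 2),
                b = c(13, 16, 19, 12, 10, 14, 20, 25))
s <- within(q, {
  bl <- ave(b, G, a, FUN = function(x) x[1])  # first b per (G, a) group as baseline
  db <- b - bl                                # value relative to the baseline
})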
</pre>
Santosh (2016-05-27T23:00:14)

performance of svm (e1071) on windows 64 bits vs 32 bits
http://comments.gmane.org/gmane.comp.lang.r.general/330635
<pre>Dear R users,
I</pre>
Frederico Arnoldi (2016-05-27T18:38:10)

couldn't install pcalg package in R 3.1.3
http://comments.gmane.org/gmane.comp.lang.r.general/330631
<pre>Hi Dears!
Would you please let me know how I can install package pcalg for R version
3.1.3 ?
I've tried different ways but got errors!
Thanks in advance!
</pre>
Lida Zeighami (2016-05-27T18:08:27)

Trimming time series to only include complete years
http://comments.gmane.org/gmane.comp.lang.r.general/330630
<pre>In bulk processing streamflow data available from an online database, I'm
wanting to trim the beginning and end of the time series so that daily data
associated with incomplete "water years" (defined as extending from Oct 1st
to the following September 30th) is trimmed off the beginning and end of
the series.
For a small reproducible example, the time series below starts on
2010-01-01 and ends on 2011-11-05. So the data between 2010-01-01 and
2010-09-30 and also between 2011-10-01 and 2011-11-05 is not associated
with a complete set of data for their respective water years. With the
real data, the initial date of collection is arbitrary, could be 1901 or
1938, etc. Because I'm cycling through potentially thousands of records, I
need help in designing a function that is efficient.
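One possible loop-free approach (a sketch, not from the thread; it reuses the example dates that follow): label each day with the calendar year its water year ends in, then keep only labels that occur on at least 365 days:

```r
# Example series from the post: 2010-01-01 through 2011-11-05.
dat <- data.frame(Date = seq(as.Date("2010-01-01"), as.Date("2011-11-05"), by = "day"))
# Water-year label = calendar year, bumped by one for Oct-Dec dates,
# so each label names the year in which that water year ends.
wy <- as.numeric(format(dat$Date, "%Y")) + (as.numeric(format(dat$Date, "%m")) >= 10)
# A water year is complete when its label covers 365 (or 366) days.
keep <- ave(wy, wy, FUN = length) >= 365
trimmed <- dat[keep, ]
range(trimmed$Date)  # 2010-10-01 through 2011-09-30
```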
dat <-
data.frame(Date=seq(as.Date("2010-01-01"),as.Date("2011-11-05"),by="day"))
dat$Q <- rnorm(nrow(dat))
dat$wyr <- as.numeric(format(dat$Date,"%Y"))
is.nxt <- as.numeric(format(dat$Date,"%m")) %in% 1:9
dat$wyr[!is.nxt] <- </pre>
Morway, Eric (2016-05-27T18:04:17)

read.fortran format
http://comments.gmane.org/gmane.comp.lang.r.general/330629
<pre>Dear fellow R users:
I am reading a data (ascii) file with Fortran fixed format, containing
multiple records. R does not recognize Fortran's record break (a slash).
I tried the following, but it does not work. Help appreciated.
60 FORMAT(1X,F6.0,5F8.6/1X,5F8.4,F10.6/1X,2F6.0,3E15.9,F8.0,F5.2,F5.3
* /1X,F7.0,2E15.9,F9.4,F5.3)
mydata<-read.fortran("G:/Journals/Disk1/12_restat_95/estimate/GROUPD.DAT",
c("1X","F6.0","5F8.6"/"1X","5F8.4","F10.6"
/"1X","2F6.0","3E15.9","F8.0","F5.2","F5.3"
/"1X","F7.0","2E15.9","F9.4","F5.3"),
col.names=c("year","w1","w2","w3","w4","w5","w6","v1","v2","v3",
"v4","v5","v6","z","chyes","chno","ec","vc","cvc",
"pop","ahs","fah","tnh","eq","vq","ups","zm1 "))
</pre>
Steven Yen (2016-05-27T17:56:32)

How to replace all commas with semicolon in a string
http://comments.gmane.org/gmane.comp.lang.r.general/330620
<pre>Dear list,
Say I have a data frame
test <- data.frame(C1=c('a,b,c,d'),C2=c('g,h,f'))
I want to replace the commas with semicolons
sub(',',';',test$C1) -> test$C1 will only replace the first comma of a string.
How do I replace them all in one run? Thanks.
Jun
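The usual answer is gsub(), which replaces every match where sub() stops at the first; fixed = TRUE treats the pattern as a literal string rather than a regex:

```r
# sub() replaces only the first match per string; gsub() replaces all.
test <- data.frame(C1 = c('a,b,c,d'), C2 = c('g,h,f'), stringsAsFactors = FALSE)
test$C1 <- gsub(',', ';', test$C1, fixed = TRUE)
test$C1  # "a;b;c;d"
```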
</pre>
Jun Shen (2016-05-27T15:10:57)

Fitting quantile (or cdf?) function to data with specified percentiles
http://comments.gmane.org/gmane.comp.lang.r.general/330615
<pre>Hello,
I hope you can help me. In class, we were given an Excel worksheet with specified formulas that take the total score from a survey (or from a specific section) and convert it to a percentage, according to a table that assigns scores to percentiles. Since the formulas are long and complicated (some have been input by hand), I figured we could fit the data with a parametric function. I plotted the table and indeed it resembled a more-or-less symmetrical quantile function, and I wanted to use R to find a curve that fitted the data. Here it is:
percentile <- c(0.01, 0.01, 0.01, 0.01, 0.01, 0.01, 0.01, 0.05, 0.05, 0.05,
0.05, 0.05, 0.10, 0.10, 0.15, 0.15, 0.20, 0.25, 0.30, 0.35, 0.40, 0.45, 0.50,
0.55, 0.60, 0.65, 0.70, 0.75, 0.80, 0.90, 0.95, 0.95, 0.99, 0.99, 0.99, 0.99,
0.99, 0.99, 0.99, 0.99, 0.99)
score <- c(10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24,
25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44,
45, 46, 47, 48, 49, 50)
I looked up the</pre>
Franco Danilo Roca Landaveri (2016-05-27T02:51:21)

Sample selection using multiple logit or similar
http://comments.gmane.org/gmane.comp.lang.r.general/330614
<pre>Greetings-
I am seeking to fit a model using Heckman-style selection; however the
wrinkle is that the selection is into multiple categories, not a binary
in/out. In this case, selection is into the type of higher-education
institution a student attended; the goal is to estimate post-graduation
outcomes while correcting for selection into universities.
There is an old Stata module, svyselmlog, that may or may not solve this
problem in Stata. Any guidance or pointers toward solving it in R would
be most welcome.
Thanks-
Andy Perrin
</pre>
Andrew Perrin (2016-05-26T21:00:26)

Package index.html pages in repository
http://comments.gmane.org/gmane.comp.lang.r.general/330610
<pre>Hi,
I am hosting a small repository (of partly non-public packages) and was
wondering whether it is possible to automatically generate a package index
page like CRAN's (e.g. for drat: https://cran.rstudio.com/web/packages/drat/drat.pdf)
from an existing repo (i.e. build the page from the package DESCRIPTIONs,
extract the manual, extract the vignette, link them, etc.).
Thanks!
Holger
</pre>
Holger Hoefling (2016-05-27T13:07:40)

Getting Rid of NaN in ts Object
http://comments.gmane.org/gmane.comp.lang.r.general/330603
<pre>Dear All,
I am sure the answer is a one liner, but I am banging my head against
the wall and googling here and there has not helped much.
Consider the following time series
tt<-structure(c(NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN,
NaN, NaN, NaN, NaN, NaN, NaN, NaN, NaN, 1133.09, 1155.77, 1179.12,
1182.85, 1133.43, 1103.36, 1081.19, 1058.55, 1056.95, 1059.13,
1018.18, 920.62, 865.99, 856.29, 841.58, 857.7, 852.71, 890.76
), .Tsp = c(1980, 2015, 1), class = "ts")
where the NaN do *not* occur internally. How can I automatically get
rid of them and adjust the start and end year of the time series
accordingly?
Many thanks
Lorenzo
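For strictly leading/trailing NAs this is indeed close to a one-liner: na.omit() has a ts method that drops them and adjusts the start accordingly (it is an error if the series contains internal NAs). A small sketch:

```r
# na.omit() on a ts object removes leading and trailing NA/NaN values
# and shifts the series' start so the time axis stays correct.
tt <- ts(c(NaN, NaN, 3, 4, 5), start = 1980, frequency = 1)
tt2 <- na.omit(tt)
start(tt2)  # 1982: the two leading NaN years are gone
```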
</pre>
Lorenzo Isella (2016-05-27T10:14:32)

rollapply and difftime
http://comments.gmane.org/gmane.comp.lang.r.general/330598
<pre>Technically, the code below works and results in a column that I'm
interested in working with for further processing. However, it is both
inefficient on lengthy (>100 yr) daily time series and is, frankly, not the
R way of doing things. Using the 'Daily' data.frame provided below, I'm
interested to know the propeR way of accomplishing this same task in an
efficient manner. I tried combinations of rollapply and difftime, but was
unsuccessful.

Eric
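The per-row date differences can usually be computed in one vectorised step with diff() on the Date column; a sketch on a tiny made-up frame (assuming tmdiff should hold days since the previous observation):

```r
# diff() on a Date vector returns the gaps as a difftime, which replaces
# the row-by-row loop; the first row has no predecessor, hence the NA.
Daily <- data.frame(Date = as.Date(c("1911-04-13", "1911-04-14",
                                     "1911-04-16", "1911-04-19")))
Daily$tmdiff <- c(NA, as.numeric(diff(Daily$Date), units = "days"))
Daily$tmdiff  # NA 1 2 3
```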
Daily <- read.table(textConnection(" Date Q
1911-04-01 4.530695
1911-04-02 4.700596
1911-04-03 4.898814
1911-04-04 5.097032
1911-04-05 5.295250
1911-04-06 6.569508
1911-04-07 5.861587
1911-04-08 5.153666
1911-04-09 4.445745
1911-04-10 3.737824
1911-04-11 3.001586
1911-04-12 3.001586
1911-04-13 2.350298
1911-04-14 2.661784
1911-04-16 3.001586
1911-04-17 2.661784
1911-04-19 2.661784
1911-04-28 3.369705
1911-04-29 3.001586
1911-05-20 2.661784"),header=TRUE)
Daily$Date <- as.Date(Daily$Date)
Daily$tmdiff <- NA
for(i in seq(2,length(Daily$Date),by=1)){
</pre>
Morway, Eric (2016-05-26T23:59:52)

Match Coordinates to NUTS 2 ID
http://comments.gmane.org/gmane.comp.lang.r.general/330594
<pre>Dear all,
I have downloaded the NUTS 2 level data as follows:
library("rgdal")
library("RColorBrewer")
library("classInt")
# library("SmarterPoland")
library(fields)
# Download Administrative Level data from EuroStat
temp <- tempfile(fileext = ".zip")
download.file("http://ec.europa.eu/eurostat/cache/GISCO/geodatafiles/NUTS_2010_60M_SH.zip",
              temp)
unzip(temp)
# Read data
EU_NUTS <- readOGR(dsn = "./NUTS_2010_60M_SH/data",
                   layer = "NUTS_RG_60M_2010")
# Subset NUTS 2 level data
map_nuts2 <- subset(EU_NUTS, STAT_LEVL_ == 2)
I also have data for a variable by coordinates, which looks like this:
structure(list(LON = c(-125.25, -124.75, -124.25, -124.25, -124.25, -124.25),
               LAT = c(49.75, 49.25, 42.75, 43.25, 48.75, 49.25),
               yr = c(2.91457704560515, 9.94774197180345, -2.71956412885765,
                      -0.466213169185147, -36.6645659563374, 10.5168056769535)),
          .Names = c("LON", "LAT", "yr"), row.names = c(NA, 6L),
          class = "data.frame")
I would like to match the coordinates to their correspondi</pre>
Miluji Sb (2016-05-26T21:30:30)

JM plotting
http://comments.gmane.org/gmane.comp.lang.r.general/330591
<pre>Dear Users,
When the joint modeling was run in R, the run completed successfully
with coefficient output. However, when plotting the terms, the plot
(fitjoint.null) does not work, with the following error message. Any advice
from the group? Thank you very much!
Best regards,
Jenny
Error in Zs * b[id.GK, , drop = FALSE] : non-conformable arrays
</pre>
Jennifer Sheng (2016-05-26T17:19:44)

Scale y-labels based on a value with 'lattice'
http://comments.gmane.org/gmane.comp.lang.r.general/330590
<pre>Dear R-helpers!
I have a data frame storing data for word co-occurrences, average
distances, and co-occurrence frequency:
Group.1 Group.2 x Freq
1 deutschland achtziger 2.00 1
2 deutschland alt 1.25 4
3 deutschland anfang -2.00 1
4 deutschland ansehen 1.00 2
5 deutschland arbeit 0.50 2
6 deutschland arbeitslos -2.00 1
Now I want to plot a lattice 'dotplot' with the formula 'Group.2~x'.
This works fine.
However, I would like to scale the y-labels (based on 'Group.2') according
to the 'Freq' value, using a log-scaled value (log(Freq + .5)). In other
words: the higher the 'Freq' value of a term, the bigger its label
should be printed in my dotplot.
The problem is that I cannot figure out how to tell lattice to scale
each y-label with the according 'Freq' value. I am quite sure I should build
a function for scales=list(y=...), but I don't know how to do it.
Many thanks in advance for your help!
Best,
Kimmo Elo
--
Åbo Akademi University / German studies
Turku,</pre>
K. Elo (2016-05-26T14:51:46)

Factor Analysis using weights for each variable
http://comments.gmane.org/gmane.comp.lang.r.general/330588
<pre>Hi R-users,
I have 1020 time series ( each of length 10,000), say, X1,X2,......,X1020
and I want to perform Factor Analysis using 50 factors on their correlation
matrix.
The issue is: for every series, I have a weight, i.e. *the series X_i has a
pre-defined weight of w_i* ( i = 1,2,...., 1020). I want to estimate the
factor loadings and specific variances in the model by optimizing the
likelihood function (assuming multivariate normality, as usual).
Is it possible to estimate the model parameters using the weights for each
time series variable in the objective function?
One comment here - For computational purposes or otherwise, it is *ok to
change my objective function* (instead of taking the likelihood function,
may be something like minimizing the weighted sum of squared specific
variances for the variables would make sense).
Any help with this will be really appreciated.
Regards,
Preetam
</pre>
Preetam Pal (2016-05-26T19:55:08)

Proportion of 1's in terminal nodes of CTREE()
http://comments.gmane.org/gmane.comp.lang.r.general/330586
<pre>Hi R-users,
I have created a Conditional Tree using the ctree function ( in package
partykit). The data had a factor - the y variable - and a host of
categorical x-variables.
Now, I want to find the proportion of cases where y = 1 in each of the
terminal nodes.
Is it possible to do so programmatically?
Tree <- ctree(y ~ ., data = mydata, control = ctree_control(maxdepth = 4))
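A sketch of one common recipe (an assumption, not confirmed by the thread): in partykit, predict(Tree, type = "node") returns each row's terminal-node id, and tapply() then gives the per-node proportion of y == 1. Made-up node ids are used here so the sketch runs without the package:

```r
# node_id stands in for partykit's predict(Tree, type = "node"),
# which maps each observation to its terminal node.
node_id <- c(3, 3, 4, 4, 4)   # hypothetical terminal-node ids
y <- c(1, 0, 1, 1, 0)         # hypothetical 0/1 outcome
# Proportion of y == 1 within each terminal node:
prop1 <- tapply(y == 1, node_id, mean)
prop1
```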
Regards,
Preetam
</pre>
Preetam Pal (2016-05-26T19:33:05)