gmane.comp.lang.r.general
http://blog.gmane.org/gmane.comp.lang.r.general
customize R code in latex
http://comments.gmane.org/gmane.comp.lang.r.general/327734
<pre>Hello,
I am struggling to include a chunk of R code in a paper that I am
writing in LaTeX.
I have seen some examples using listings, but I would like to keep the R
colors.
To do that, I wrote the code in R Markdown and kept the LaTeX file that
knitr produces, which gives me the attached files.
However, to fit it into the paper, I would like to change the dimensions of
the chunk (make it smaller) and also put it inside a black box.
Do you have any advice?
Thank you very much in advance.
------
Aurora González Vidal
Phd student in Data Analytics for Energy Efficiency
Faculty of Computer Sciences
University of Murcia
< at >. aurora.gonzalez2< at >um.es
T. 868 88 7866
www.um.es/ae
</pre>AURORA GONZALEZ VIDAL 2016-02-10T19:34:46
Calculate average of many subsets based on columns in another dataframe
http://comments.gmane.org/gmane.comp.lang.r.general/327732
<pre>Hello, I have a dataframe with a date range, and another dataframe
with observations by date. For each date range, I'd like to average
the values within that range from the other dataframe. I've provided
code below doing what I would like, but using a for loop is too
inefficient for my actual case (takes about an hour). So I'm looking
for a way to vectorize.
set.seed(345)
date.range <- seq(as.POSIXct("2015-01-01"),as.POSIXct("2015-06-01"),
by="DSTday")
observations <- data.frame(date=date.range, values=runif(152,1,100) )
groups <- data.frame(start=sample(date.range[1:50], 20), end =
sample(date.range[51:152], 20), average = NA)
#Potential Solution (too inefficient)
for(i in 1:NROW(groups)){
groups[i, "average"] <- mean(observations[observations$date >=
groups[i, "start"] & observations$date <=groups[i, "end"], "values"])
}
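
One possible way to vectorize the loop above, sketched under the assumption that the observations can be sorted by date: prefix sums plus findInterval() reduce each range mean to two lookups. The names obs, csum, lo and hi are illustrative and not from the original post, and the left.open argument needs R 3.3.0 or later.

```r
# Same toy data as in the post
set.seed(345)
date.range <- seq(as.POSIXct("2015-01-01"), as.POSIXct("2015-06-01"),
                  by = "DSTday")
observations <- data.frame(date = date.range, values = runif(152, 1, 100))
groups <- data.frame(start = sample(date.range[1:50], 20),
                     end = sample(date.range[51:152], 20), average = NA)

# Prefix sums over the observations sorted by date (assumption: sortable)
obs <- observations[order(observations$date), ]
csum <- c(0, cumsum(obs$values))
obs_t <- as.numeric(obs$date)

# lo: number of observations strictly before each start
# hi: number of observations at or before each end
lo <- findInterval(as.numeric(groups$start), obs_t, left.open = TRUE)
hi <- findInterval(as.numeric(groups$end), obs_t)

# Sum of values inside [start, end] divided by the number of values there
groups$average <- (csum[hi + 1] - csum[lo + 1]) / (hi - lo)
```

For the extension with several value columns, one prefix-sum vector per column could be built the same way and then selected per range.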
As an extension to this, there will end up being multiple value
columns, and each range will also identify which column to average. I
think if I can figure out the first problem </pre>Peter Lomas 2016-02-10T20:18:36
Installing Rstudio Addinexamples
http://comments.gmane.org/gmane.comp.lang.r.general/327730
<pre>Kindly assist. I run the below command in Rstudio
and get the following error
Downloading GitHub repo rstudio/addinexamples@master
from URL https://api.github.com/repos/rstudio/addinexamples/zipball/master
Error in curl::curl_fetch_memory(url, handle = handle) : Timeout was reached
[[alternative HTML version deleted]]
</pre>Archbold Muhle 2016-02-10T17:29:48
MCA, Rcmdr, FactoMineR
http://comments.gmane.org/gmane.comp.lang.r.general/327719
<pre>Dear R users,
I am a beginner in R so my question may be a bit stupid. I searched
the forums and did not find the answer I am looking for. I should mention
that I am using RStudio on Mac (OS X 10.10.5).
I want to run an MCA on my data (with Benzecri correction, with
active and supplementary variables). It seems that the FactoMineR package
does this. However, it does not seem to work as it should. In the RStudio
console I call "library(Rcmdr)", and it opens a new window with XQuartz.
Then I want to load the FactoMineR package, so I go to Tools, select
FactoMineR and confirm. Then I am supposed to load plug-ins, but the
option is not available (see picture attached).
Do you know what causes this?
Thank you very much for your help,
Sarah
</pre>Sarah Bortolamiol 2016-02-10T11:01:00
MonteCarlo sampling on profile log likelihood - package extRemes
http://comments.gmane.org/gmane.comp.lang.r.general/327718
<pre>Hi
I am using the package extRemes to assess 100-year return period
runoffs with the GEV and GP distribution functions and the associated
95% confidence intervals.
I use the MLE method for that.
Now I would like to sample a few thousand return-level values on
the profile likelihood between the 95% confidence interval boundaries.
I saw that the function ‘profliker’ allows sampling log-likelihood
values along the profile likelihood.
Is there any way to sample return levels or get the return levels
corresponding to these log-likelihood values?
Thanks for any help.
Kind regards,
LCollet
______________________________________________
R-help< at >r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.</pre>Lila Collet 2016-02-10T15:19:21
Shiny
http://comments.gmane.org/gmane.comp.lang.r.general/327714
<pre>Hi Team,
Could anyone please share Shiny app code (server and UI) for multiple linear
regression with a dropdown menu?
Thanks and Regards
Venkatesan
[[alternative HTML version deleted]]
</pre>Venky 2016-02-10T12:41:30
Shiny Import data set
http://comments.gmane.org/gmane.comp.lang.r.general/327713
<pre>Hi,
I am trying to import an Excel data set into a Shiny app and I want to create
a dropdown menu for it.
E.g. the 1st column of the Excel data should go in one tab, the 2nd column
in another, and so on.
I also have separate calculation files (word cloud, binomial regression, etc.). I
want to merge these files into the Shiny app and UI. Please help me figure
out these issues.
Thanks and Regards
Venkatesan
[[alternative HTML version deleted]]
</pre>Venky 2016-02-10T12:40:06
Complete archives for MARC needed for r-devel, r-help and r-packages
http://comments.gmane.org/gmane.comp.lang.r.general/327709
<pre>I need the archives for r-help from October 2007 to November 2013 in MARC [1].
The same goes for r-devel [2] who needs archives for July 2005 to
November 2013 and r-packages [3] needing archives for August 2007 to
December 2013.
P.S.: I'm not subscribed to this mailing list.
[1] http://marc.info/?l=r-help
[2] http://marc.info/?l=r-devel
[3] http://marc.info/?l=r-packages
</pre>Tae Wong 2016-02-10T03:15:20
Revolutions blog: January 2016 roundup
http://comments.gmane.org/gmane.comp.lang.r.general/327707
<pre>Since 2008, Microsoft (formerly Revolution Analytics) staff and guests have written about R every weekday at the
Revolutions blog: http://blog.revolutionanalytics.com
and every month I post a summary of articles from the previous month of particular interest to readers of r-help.
And in case you missed them, here are some articles related to R from the month of January:
Animated visualizations and analysis of data from NYC's municipal bike program, created with R:
http://blog.revolutionanalytics.com/2016/01/new-yorkers-municipal-bikes-and-the-weather.html
Many local R user groups are sharing materials from meetups using Github:
http://blog.revolutionanalytics.com/2016/01/r-user-groups-on-github.html
A detailed R tutorial on analyzing your Twitter archive and performing sentiment analysis:
http://blog.revolutionanalytics.com/2016/01/twitter-sentiment.html
How to combine R and Python in Jupyter notebooks: http://blog.revolutionanalytics.com/2016/01/pipelining-r-python.html
Many datasets are available for</pre>David Smith 2016-02-09T23:07:28
GPU package crowd-source testing
http://comments.gmane.org/gmane.comp.lang.r.general/327705
<pre>Greetings R users,
I would like to request any users who would be willing to test one of my
packages. Normally I would be content using testthat and continuous
integration services but this particular package is used for GPU computing
(hence the cross-posting). It is intended to be as general as possible for
available devices but I only have access to so much hardware. I can't
possibly test it against every GPU available.
As such, I would sincerely appreciate any user that has at least one GPU
device (Intel, AMD, or NVIDIA) and is willing to experiment with the
package to try it out. Note, this will require installing an OpenCL SDK of
some form. Installation instructions for the package are found here (
https://github.com/cdeterman/gpuR/wiki).
At the very least, if you have a valid device, you would only need to
download the 'development' version of the package and experiment with the
functions such as a matrix multiplication.
devtools::install_github("cdeterman/gpuR", ref = "develop")
library(gpuR</pre>Charles Determan 2016-02-09T18:20:19
platform dependent regex
http://comments.gmane.org/gmane.comp.lang.r.general/327700
<pre>I just spent a day and a half debugging someone's code, only to
discover that the problem is platform dependent regular expressions.
For example:
## Windows:
grepl("\\W", "", "س") # TRUE
## OS X:
grepl("\\W", "", "س") # TRUE
## Linux:
grepl("\\W", "", "س") # FALSE
Ouch. The documentation does say "Certain named classes of characters
are predefined. Their interpretation depends on the _locale_", but
that doesn't seem to cover it given that the locale on OS X and Linux
was the same (en_US.UTF-8).
Question: Is this considered a bug, and if so what can I do to help
fix it? I've checked and the issue is present in both r-patched and
r-devel.
Best,
Ista
</pre>Ista Zahn 2016-02-09T16:55:33
R hangs when plot() is used
http://comments.gmane.org/gmane.comp.lang.r.general/327683
<pre>Dear R users
I have compiled R from source in local user account (at non default
location). R seems to be working fine but issuing plot() command opens a
window (supposedly graph, but nothing is visible) and then R terminal also
freezes. Any suggestions?
Following is the session info output.
R version 3.2.3 (2015-12-10)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: CentOS release 6.7 (Final)
locale:
[1] LC_CTYPE=en_US LC_NUMERIC=C LC_TIME=en_US
[4] LC_COLLATE=en_US LC_MONETARY=en_US LC_MESSAGES=en_US
[7] LC_PAPER=en_US LC_NAME=C LC_ADDRESS=C
[10] LC_TELEPHONE=C LC_MEASUREMENT=en_US LC_IDENTIFICATION=C
attached base packages:
[1] stats graphics grDevices utils datasets methods base
Thank you
Best
Ashutosh
[[alternative HTML version deleted]]
</pre>ashutosh srivastava 2016-02-09T07:24:36
missing value where TRUE/FALSE needed ERROR
http://comments.gmane.org/gmane.comp.lang.r.general/327682
<pre>Hi all,
I'm trying to write a function to implement a Metropolis-within-Gibbs
algorithm for two parameters. I'm including a naive version here so as to be
able to spot the error I got. So I first generate the vectors, X and R,
that will help to start the algorithm using (for example):
n=8; m=5; p=0.1; t=0.9 ; JH=10;
R <- numeric(m)
W <- numeric(m)
V <- numeric(m)
U <- numeric(m)
X <- numeric(m)
Bay.alpha<- numeric (JH)
Bay.beta<- numeric (JH)
Bay.Surv <- numeric (JH)
hyp=c(3,15,6,22.5)
theta<-c(0.2,2)
alpha.curr<-theta[1]
beta.curr<- theta[2]
R[1]<-rbinom(1, n-m, p)
for (i in 2:m-1) {
R[i]<-rbinom(1,n-m-sum(R[1:i-1]),p)
}
R[m]<-n-m-sum(R[1:m-1])
W<-runif(m, min = 0, max = 1)
for (i in 1:m){
V[i]<-W[i]^(1/(i+sum(R[(m-i+1):m])))
}
for (i in 1:m){
U[i]<- 1- prod(V[(m-i+1):m])
}
for (i in 1:m){
X[i]<- ((-1/theta[1])*log(1-U[i]))^(1/theta[2])
}
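
A side note on the ranges used in the code above, offered as a sketch and not as a diagnosis of the reported error: in R the ":" operator binds more tightly than binary "-", so 2:m-1 is read as (2:m) - 1. The variable names below are illustrative only.

```r
m <- 5

# ":" is evaluated before binary "-", so 2:m-1 means (2:m) - 1
a <- 2:m-1        # 1 2 3 4
b <- 2:(m - 1)    # 2 3 4

# The same applies to index ranges: 1:i-1 is (1:i) - 1, i.e. 0:(i - 1)
i <- 3
idx_posted   <- 1:i-1       # 0 1 2
idx_intended <- 1:(i - 1)   # 1 2

# A zero index is silently dropped, so both select the same elements here
x <- c(10, 20, 30, 40, 50)
same_selection <- identical(x[idx_posted], x[idx_intended])
```

As a consequence, the loop written as for (i in 2:m-1) starts at i = 1, which overwrites R[1]; whether that matters for the algorithm is for the author to judge.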
Then, I defined three functions 1- alpha.update() for updating alpha (Gibbs
step) 2- bettarg(), for the target distribution of beta </pre>Maram SAlem 2016-02-09T13:01:51
XGBoost continuous outcome case --- reg:linear in R
http://comments.gmane.org/gmane.comp.lang.r.general/327679
<pre>Hi,
While learning how to implement XGBoost in R I came across the case below and want to know how to go about it.
Outcome variable: continuous
Independent features: mix of categorical and continuous
nrow(train_set): 8523
Since, XGBoost natively supports only numeric features, I applied one hot encoding on the training data set:
target <- train_set$Outlet_sales
sparsed_train_set <- sparse.model.matrix(~.-1, data=train_set)
nrow(sparsed_train_set) : 4526 #As expected, the row count is reduced.
Note: The target variable is continuous and has as many rows as in train_set i.e 8523, before one hot encoding is applied.
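
One common reason for a model matrix having fewer rows than its input is missing values: the default na.action (na.omit) drops incomplete rows while the model frame is built. Whether train_set actually contains NAs is an assumption here. The sketch below shows the effect with base R's model.matrix() on a made-up data frame df; sparse.model.matrix() from the Matrix package is not used, so the block has no package dependencies.

```r
# Hypothetical toy data with one missing value
df <- data.frame(x = c(1, NA, 3),
                 f = factor(c("a", "b", "a")))

# Default behaviour: the row containing NA is dropped
mm_default <- model.matrix(~ . - 1, data = df)
n_default <- nrow(mm_default)

# Keeping all rows: build the model frame with na.pass first
mf <- model.frame(~ . - 1, data = df, na.action = na.pass)
mm_all <- model.matrix(terms(mf), data = mf)
n_all <- nrow(mm_all)
```

With all rows kept, the label vector and the feature matrix have the same length again; the NA entries stay in the matrix and still have to be handled downstream.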
# To build the model:
bst <- xgboost(data = sparsed_train_set, label = target, max.depth = 4,
eta = 1, nthread = 4, nround = 50, objective = "reg:linear")
# Above execution would fail as
My questions:
- How should I handle above disparity between sparsed training data and label while building the model ?
- How should I use XGBoost to perform regression where outcome is continuous ? Most </pre>Sandeep Rana 2016-02-09T06:18:23
Markovchain and Sensitivity analysis
http://comments.gmane.org/gmane.comp.lang.r.general/327678
<pre>I'm trying to develop a markov chain transition matrix to simulate an
infectious disease model. I've got a much larger matrix that I'm working
with but here's the code for a toy version of the model:
library("markovchain")
byRow <- TRUE
#Parameters
pop <- 1000
b1 <- 0.0000095
b2 <- 0.0000048
b3 <- 0.0000097
u1 <- 0.046
u2 <- 0.05
c <- 0.91
cf <- 0.25
e <- 0.1014
vb <- b3*e
s1s2 <- 1-b1-u1
s2s3 <- 1-b2-c-u2
s3e <- 1-b3
ir <- 1-cf
ve <- 1-vb
toyModel <- new("markovchain", states = c("birth", "Susceptible1",
"Susceptible2", "Susceptible3", "Infected", "Vaccinated", "Recovered",
"Exit"),
transitionMatrix = matrix(data = c(1, (pop*0.024), 0, 0, 0, 0, 0,
-(pop*0.024),
0, 0, s1s2, 0, b1, 0, 0, u1,
0, 0, 0, s2s3, b2, c, 0, u2,
0, 0, 0, 0, b3, 0, 0, s3e,
0, 0, 0, 0, 0, 0, ir, cf,
0, 0, 0, 0, vb, 0, 0, ve,
0, 0, 0, 0, 0, 0, 0.5, 0.5,
0, 0, 0, 0, 0, 0, 0, 1), byrow = byRow, nrow = 8),
name = "Toy")
initial <- c(1, 0, 0, 0, 0, 0, 0, 0)
after100 <- initial * (toyModel^ 100)
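
A quick sanity check on the toy matrix, sketched in base R without the markovchain package: a transition matrix needs every entry in [0, 1] and every row summing to 1. The names P and bad are illustrative; the parameter values are copied from the code above.

```r
pop <- 1000
b1 <- 0.0000095; b2 <- 0.0000048; b3 <- 0.0000097
u1 <- 0.046; u2 <- 0.05
c <- 0.91; cf <- 0.25; e <- 0.1014
vb <- b3 * e
s1s2 <- 1 - b1 - u1
s2s3 <- 1 - b2 - c - u2
s3e <- 1 - b3
ir <- 1 - cf
ve <- 1 - vb

# Same entries as the transitionMatrix argument in the post
P <- matrix(c(1, pop * 0.024, 0, 0, 0, 0, 0, -(pop * 0.024),
              0, 0, s1s2, 0, b1, 0, 0, u1,
              0, 0, 0, s2s3, b2, c, 0, u2,
              0, 0, 0, 0, b3, 0, 0, s3e,
              0, 0, 0, 0, 0, 0, ir, cf,
              0, 0, 0, 0, vb, 0, 0, ve,
              0, 0, 0, 0, 0, 0, 0.5, 0.5,
              0, 0, 0, 0, 0, 0, 0, 1), byrow = TRUE, nrow = 8)

# Rows should sum to 1 (up to floating point error)
row_ok <- isTRUE(all.equal(rowSums(P), rep(1, 8)))

# Entries outside [0, 1] cannot be probabilities
bad <- which(P < 0 | P > 1, arr.ind = TRUE)
```

In this check the rows do sum to 1, but the first row holds pop*0.024 = 24 and -24, which are not valid probabilities; births may need to be modelled outside the transition matrix.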
The issue is that the current variable valu</pre>Kristin Bornstein 2016-02-08T22:45:38
calling plot
http://comments.gmane.org/gmane.comp.lang.r.general/327674
<pre>I'm getting an interesting error:
> plotxy <- function(x, ...){
+ plot(x, ...)
+ }
> XY <- data.frame(x1=1:3, y1=4:6)
> plotxy(y1~x1, XY, xlim=c(0, max(x1)))
Show Traceback
Rerun with Debug
Error in eval(expr, envir, enclos) : object 'x1' not found
The following work:
plotxy(y1~x1, XY)
plot(y1~x1, XY, xlim=c(0, max(x1)))
Within "plotxy", R can't find "x1" to compute "xlim". Is there a
way I can make x1 available to xlim?
Thanks,
Spencer
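
A minimal workaround sketch for the question above: arguments passed through "..." are evaluated where the wrapper was called, and no object x1 exists there, so the column can be referenced through the data frame explicitly. The pdf(NULL) call is only there so the example runs without opening a graphics window.

```r
plotxy <- function(x, ...) {
  plot(x, ...)
}
XY <- data.frame(x1 = 1:3, y1 = 4:6)

pdf(NULL)  # null graphics device, nothing is written to disk

# Refer to the column explicitly, so xlim does not depend on
# non-standard evaluation inside plot.formula()
plotxy(y1 ~ x1, XY, xlim = c(0, max(XY$x1)))

invisible(dev.off())
```

An alternative with the same effect would be to compute the limits beforehand, for example xl <- c(0, max(XY$x1)), and pass xlim = xl.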
</pre>Spencer Graves 2016-02-09T03:17:57
Compare correlation coefficients between samples
http://comments.gmane.org/gmane.comp.lang.r.general/327671
<pre>Dear All,
I have a dataframe of 1000 rows and 4 columns. Each row represents a pair
of vectors (in my case a pair of genes) while the columns represent the
following
estimate.A: spearman correlation coefficient of gene[i] and gene[j]
expression across 120 samples from cancer type A
prob.A: Probability value associated with the estimate.A as inferred from
cor.test
estimate.B: spearman correlation coefficient of gene[i] and gene[j]
expression across 48 samples from cancer type B
prob.B: Probability value associated with the estimate.A as inferred from
cor.test
To sum up the data.frame will look like
S<-data.frame(estimate.A=runif(1000,-1,1),prob.A=runif(1000,0,1),estimate.B=runif(1000,-1,1),prob.B=runif(1000,0,1))
I want to calculate the conditional probability for all the pair of genes
showing negative correlation in B (at p.value < 0.01) when they show a
strong negative correlation in A (again at p.value < 0.01). I was thinking
in the lines of using the "prob" package to estimate the conditional
probab</pre>swaraj basu 2016-02-08T22:15:37
Help in meta-analysis (URGENT please)
http://comments.gmane.org/gmane.comp.lang.r.general/327669
<pre>Dear all,
I’m conducting a meta-analysis and I usually use RevMan, but as I’m trying to use R more and more, I would like to conduct the meta-analysis here, in R (RStudio).
One of my problems, I think, is that:
1. it’s the first time :)
2. I only have data for 1 arm, as you can see in the data that follows.
ARTIGOqtttqctcPersonal Notesqt2
Giuliani M. (2014)--15151862only MSM.347
Diaz A. (2015)----only MSM (n=3081)2499
Niedźwiedzka-Stadnik M. (2015)8281098--326
Hoenig M. (2015)--8506-MSM (n=8925)419
Wu H. (2015)58145-16713n=1689287
Pan X. (2015)--only MSM (n=1316)-
Ma Q. (2015)--only MSM (n=424)-
Op de Coul E. (2015)8596-HIV-infected patients (n=20965)12369
Liu G. (2015)--1003 (?)-only MSM (n=1041) - some converted to HIV+ during the study-
Hoenigl M. (2015)--only MSM (n=8935) analysis HIV tests repetitions-
Moller L. M. (2015)--469-only MSM (N=561)92
watkins (2015)----only MSM (n=1154) only analysis believes concerning t</pre>Rosa Oliveira 2016-02-08T18:40:20
syntax for nested random factors in lme
http://comments.gmane.org/gmane.comp.lang.r.general/327668
<pre>Hi,
I've been taught that if I want to nest random factor A into B in an lme
model, the syntax is as follows: lme(x~y+B,random=~1|B/A).
In the case of my data, matters seem to be complicated by the fact that B
is a categorical variable with only 2 levels. When I run the lme with the
above syntax, I obtain a NaN p value for B as a fixed factor in the
model. When I rewrite the random factor as random=~1|A/B, I obtain a p value.
Is the correct format for a nested random factor indeed B/A in this case?
Is it incorrect to write it as A/B?
Please let me know if you would like additional information, such as
observations and output.
Kathleen
[[alternative HTML version deleted]]
</pre>Kathleen Côté 2016-02-08T18:41:31
Dates and missing values
http://comments.gmane.org/gmane.comp.lang.r.general/327665
<pre>I have a data frame with dates as integers:
> summary(persons[, c("foddat", "doddat")])
foddat doddat
Min. :16790000 Min. :18000000
1st Qu.:18760904 1st Qu.:18810924
Median :19030426 Median :19091227
Mean :18946659 Mean :19027233
3rd Qu.:19220911 3rd Qu.:19310526
Max. :19660124 Max. :19691228
NA's :624 NA's :207570
After converting the dates to Date format ('as.Date') I get:
> summary(per[, c("foddat", "doddat")])
foddat doddat
Min. :1679-07-01 Min. :1800-01-26
1st Qu.:1876-09-04 1st Qu.:1881-09-24
Median :1903-04-26 Median :1909-12-27
Mean :1895-02-04 Mean :1903-02-22
3rd Qu.:1922-09-10 3rd Qu.:1931-05-26
Max. :1966-01-24 Max. :1969-12-28
My question is: Why are the numbers of missing values not printed in the
second case? 'is.na' gives the correct (same) numbers.
Can I somehow force 'summary' to print NA's? I found no clues in the
documentation.
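
Independently of what summary() prints for Date columns, the missing values can be counted explicitly. The sketch below uses a made-up three-row stand-in for the persons data frame, and the yyyymmdd conversion shown is an assumption, since the post does not show how as.Date was called.

```r
# Hypothetical stand-in for two of the integer-coded date columns
persons <- data.frame(foddat = c(19030426L, NA, 19220911L),
                      doddat = c(19091227L, NA, NA))

# Convert yyyymmdd integers to Date; NA stays NA
per <- as.data.frame(lapply(persons, function(x)
  as.Date(as.character(x), format = "%Y%m%d")))

# Count the missing values per column
na_counts <- colSums(is.na(per))
na_counts
```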
> sessionInfo()
R version 3.2.</pre>Göran Broström 2016-02-08T17:26:20
How to extract same columns from identical dataframes in a list?
http://comments.gmane.org/gmane.comp.lang.r.general/327658
<pre>Hello,
I have a list of 7 data frames, each data frame having 24 rows (hour of
the day) and 5 columns (weeks) with a total of 5 x 24 values
I would like to combine all 7 columns of week 1 (and 2 ...) in a
separate data frame for hourly calculations, e.g.
In some way sapply (lapply) works, but I cannot directly select columns
of the original data frames in the list. As a workaround I have to
select a range of values:
Values 1:24 give the first column, 25:48 the second and so on.
Is there an easier / more direct way to select for specific columns
instead of selecting a range of values, avoiding loops?
Cheers,
Wolfgang
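
One way to select a column by name from every data frame in a list, sketched on simulated data: sapply() with the `[[` extractor returns a 24 x 7 matrix for a given week. The list name days and the column names week1 to week5 are assumptions, since the post does not show the actual object.

```r
set.seed(1)

# Hypothetical list: 7 data frames (days), each 24 rows (hours) x 5 columns (weeks)
days <- lapply(1:7, function(d) {
  as.data.frame(matrix(rnorm(24 * 5), nrow = 24,
                       dimnames = list(NULL, paste0("week", 1:5))))
})

# Pull the column "week1" out of each data frame: one column per day
week1 <- as.data.frame(sapply(days, `[[`, "week1"))

# Example of an hourly calculation across the 7 days
hourly_mean <- rowMeans(week1)
```

The same call with "week2", or a column index such as 2, gives the other weeks; wrapping it in lapply() over the week names would produce all five data frames at once.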
</pre>Wolfgang Waser 2016-02-08T13:33:21