gmane.comp.lang.r.general
http://blog.gmane.org/gmane.comp.lang.r.general
Eager To Learn And Contribute!
http://comments.gmane.org/gmane.comp.lang.r.general/312090
<pre>Hi,
I am Kartik Singh, an open source enthusiast who aspires to be a Data
Scientist.
Getting and cleaning data and analysing it with R is something that
fascinates me.
I am not new to the field of Computer Science or Statistics, but R is
something I have started to learn and explore recently. I have just
completed an online course on R programming on Coursera by Johns Hopkins
University with distinction.
My skills include programming in C and C++, basic knowledge of
RDBMS, and programming in SQL*Plus/MySQL.
I have read all the introductory manuals and researched the
R programming language extensively. I was overwhelmed by the power of the R library
and wish to contribute to the R project in any way possible.
It would be very kind of you if you could guide me in any way possible,
because I am very eager to learn and contribute.
Thank You.
</pre>
Kartik Singh 2014-07-31T03:35:51
separate numbers from chars in a string
http://comments.gmane.org/gmane.comp.lang.r.general/312073
<pre>Hi,
If I have a string of consecutive chars followed by consecutive numbers and then chars, like "absdfds0213451ab", how can I separate the consecutive chars from the consecutive numbers?
grep doesn't seem to be helpful:
grep("[a-z]","absdfds0213451ab", ignore.case=T)
[1] 1
grep("[0-9]","absdfds0213451ab", ignore.case=T)
[1] 1
Thanks
Carol
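One possible approach (a sketch, not taken from the thread): grep only reports which elements of a vector contain a match, which is why both calls above return 1. The runs themselves can be extracted with base R's regmatches and gregexpr. The variable names below are illustrative.

```r
x <- "absdfds0213451ab"

# Match maximal runs of letters or maximal runs of digits.
runs <- regmatches(x, gregexpr("[A-Za-z]+|[0-9]+", x))[[1]]
runs
# [1] "absdfds" "0213451" "ab"

# Keep only the digit runs or only the letter runs.
digits <- runs[grepl("^[0-9]+$", runs)]
letters_only <- runs[grepl("^[A-Za-z]+$", runs)]
```

The digit run is kept as a character string so that the leading zero in "0213451" is not lost.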
</pre>
carol white 2014-07-30T20:13:52
a quick list mode question
http://comments.gmane.org/gmane.comp.lang.r.general/312071
<pre>Hello yet again
I was looking at the class and the mode of a list. For the class of a
list, of course I got a list, but for the mode, I got a function.
Why do you get a function, please?
Thanks,
Erin
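A likely explanation, assuming mode() was called on the name list itself rather than on a list object: list is the constructor function, so its mode is "function". A small sketch of the difference (variable names are illustrative):

```r
# The constructor itself is a function.
mode_of_constructor <- mode(list)

# An actual list object has mode and class "list".
mode_of_object <- mode(list(1, "a"))
class_of_object <- class(list(1, "a"))

mode_of_constructor
# [1] "function"
mode_of_object
# [1] "list"
```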
</pre>
Erin Hodgess 2014-07-30T20:07:18
binary to R object
http://comments.gmane.org/gmane.comp.lang.r.general/312070
<pre>Hello,
I have stored R objects as hexadecimal in a MySQL database. I usually transform them to binary and then save the result to a file. E.g.
hex <- getBlob #gets blob from database as hexadecimal
binary <- transformToBinary(hex) #moves from hex to binary
I would usually save them into a file as follows:
writeBin(object=binary,con="fileName.txt")
Then eventually if I want to use it in R I just load it:
load("fileName.txt")
However, how can I go from the binary directly into an R object without writing it to a file?
say:
myObject <- unknownFunction(binary)
I hope this is enough detail. Any help appreciated.
In retrospect, I could probably have used serialize/unserialize, stored a serialized string in MySQL, and unserialized it when needed (or written it to a file if necessary), but I didn't know of those when I did this.
Thanks in advance,
Ramiro
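For reference, the serialize/unserialize route mentioned above can be sketched as follows. This only illustrates the in-memory round trip for objects stored that way; it does not convert bytes that were already written in the save()/load() file format. The object and variable names are illustrative.

```r
obj <- list(a = 1:3, b = "text")

# With connection = NULL, serialize() returns a raw vector instead of writing to a file.
binary <- serialize(obj, connection = NULL)

# The raw vector could be stored as a blob and later turned back into an object in memory.
restored <- unserialize(binary)
identical(obj, restored)
# [1] TRUE
```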
</pre>
Ramiro Barrantes 2014-07-30T19:56:53
xtable problems with xts objects
http://comments.gmane.org/gmane.comp.lang.r.general/312069
<pre>Hi, I'm having trouble getting xtable to print a simple xts object. When the frequency is 1800 seconds I get an error. However when frequency(x) is reported as 1 it works. I have included example below.
Another question is how can I have the the time index printed in place of the row number?
Thanks for your help,
--
Stergios Marinopoulos
library(xts)
library(xtable)
# This does not work. The indices are 1800 seconds apart.
x = xts(x=rep(pi, 6), order.by=as.POSIXlt( "2014-01-01 06:30:00" ) + 1800 * 0:5 )
print(xtable(x))
print(frequency(x)) # [1] 0.0005555556
# This is the error message:
# Error in rep(NA, start(x)[2] - 1) : invalid 'times' argument
# It works when I make the indices only 1 second apart.
x = xts(rep(pi, 6), as.POSIXlt( "1970-01-01 00:00:00" ) + 0:5 )
print(xtable(x))
print(frequency(x)) # [1] 1
# Here is a daily data example that works.
library(quantmod)
AAPL = getSymbols("AAPL", auto.assign=FALSE, from="2014-01-01", to="2014-01-10")
print(xtable(AAPL), in
</pre>
Stergios Marinopoulos 2014-07-30T17:01:56
is.na() == TRUE for POSIXlt time / date of "2014-03-09 02:00:00"
http://comments.gmane.org/gmane.comp.lang.r.general/312054
<pre>"I'm so confused!" Why does is.na() report TRUE for a POSIXlt date &
time of 2014-03-09 02:00:00 ?
[1] "2014-03-09 02:00:00"
[1] TRUE
[1] NA
structure(list(sec = 0, min = 0L, hour = 2, mday = 9L, mon = 2L,
year = 114L, wday = 0L, yday = 67L, isdst = 0L, zone = "",
gmtoff = NA_integer_), .Names = c("sec", "min", "hour", "mday",
"mon", "year", "wday", "yday", "isdst", "zone", "gmtoff"), class = c("POSIXlt",
"POSIXt"))
POSIXlt[1:1], format: "2014-03-09 02:00:00"
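A possible explanation, assuming the session runs in a US time zone that observes daylight saving time: clocks jumped from 02:00 to 03:00 on Sunday 2014-03-09, so 02:00:00 local time never occurred that day and cannot be mapped to a valid instant. The sketch below contrasts this with UTC, where the same wall-clock time is valid. The zone name is an assumption, and how the local case is reported may differ by platform.

```r
stamp <- "2014-03-09 02:00:00"

# UTC has no daylight saving transition, so this time is valid there.
utc_time <- as.POSIXlt(stamp, tz = "UTC")
is.na(utc_time)
# [1] FALSE

# In a zone with a spring-forward gap at this instant the time does not exist;
# whether it shows up as NA or as a shifted time can depend on the platform.
local_time <- as.POSIXlt(stamp, tz = "America/Chicago")
is.na(local_time)
```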
</pre>
John McKown 2014-07-30T17:08:58
DATA SUMMARIZING and REPORTING
http://comments.gmane.org/gmane.comp.lang.r.general/312046
<pre>Hi R-helpers,
I have a data frame like

ID_CASE  YEAR_MTH  ATT_1  A1   A2  A3
CB26A    201302    1      146  42  74
CB26A    201302    0      140  50  77
CB26A    201303    0      128  36  77
CB26A    201304    1      146  36  72
CB26A    201305    1      134  36  80
CB26A    201305    0      148  30  80
CB26A    201306    0      134  20  72
CB26A    201307    1      125  48  79
CB26A    201309    0      122  44  74
CB26A    201310    1      126  37  72
CB26A    201310    1      107  43  75

I want a final data frame which will look like

ID_CASE  Period         No.ofChange  %Paid
CB26A    201302-201304  2            0.414365
CB26A    201303-201305  2            0.445245
CB26A    201304-201306  1            0.444444
CB26A    201305-201307  2            0.460741
CB26A    201306-201308  1            0.461774
CB26A    201307-201309  1            0.451327
CB26A    201308-201310  1            0.461378

where,
Period = a time period of 3 months which is shifted by 1 month subsequently
No.ofChange = number of times ATT_1 has changed values in this period
%Paid = sum(A3)/(sum(A1)+sum(A2)) for this period
E.g. for Period=201302-201304,
%Paid = (74+77+77+72)/((146+140+128+146)+(42+50+36+36))
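The worked example above can be checked with a short base R sketch. It covers only the rows of the first period (201302-201304), and it assumes that a "change" means two consecutive rows with different ATT_1 values; the variable names are illustrative.

```r
df <- data.frame(
  YEAR_MTH = c(201302, 201302, 201303, 201304),
  ATT_1    = c(1, 0, 0, 1),
  A1       = c(146, 140, 128, 146),
  A2       = c(42, 50, 36, 36),
  A3       = c(74, 77, 77, 72)
)

# Select the rows that fall inside the three-month window.
in_period <- df$YEAR_MTH >= 201302 & df$YEAR_MTH <= 201304
win <- df[in_period, ]

# %Paid = sum(A3) / (sum(A1) + sum(A2)) = 300 / (560 + 164)
pct_paid <- sum(win$A3) / (sum(win$A1) + sum(win$A2))
round(pct_paid, 6)
# [1] 0.414365

# Number of times ATT_1 differs from the previous row (assumed definition).
n_change <- sum(diff(win$ATT_1) != 0)
n_change
# [1] 2
```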
Period calculation shou
</pre>
Abhinaba Roy 2014-07-30T12:46:06
one more knitr question, please
http://comments.gmane.org/gmane.comp.lang.r.general/312044
<pre>Hello again:
Is there a way to put a page break inside of a code chunk, please?
Thanks,
Erin
</pre>
Erin Hodgess 2014-07-30T11:58:18
ATTN: Urgent Guidance Needed on scraping tweets for last 10 years using TwitteR / search twitter function.
http://comments.gmane.org/gmane.comp.lang.r.general/312041
<pre>Hi
This is Abhishek and I am trying to look for tweets on 'Election' from
2000 to YTD. I have registered on Twitter and performed a handshake
between the systems as well. Next I am trying to fetch tweets
chronologically using the code below:
tweets1.list = searchTwitter('Election',lang="en",since='2000-07-01',
until='2014-07-30', cainfo="cacert.pem")
All I get in return is 26 line items from 27th - 28th of July only.
Can you please help me understand why it gives me so few
tweets, going back only two days, and also whether there is an alternative
method of fetching tweets over the last ten years, at the minimum,
categorized by date?
Many thanks in advance for your guidance. This is an urgent request
and hence I am requesting your immediate assistance.
Best
Abhishek
</pre>
Abhishek Dutta 2014-07-30T08:58:47
Map with no political border
http://comments.gmane.org/gmane.comp.lang.r.general/312038
<pre>Hi,
I would like to plot a map without political borders, but not in black. I
am mainly using the maps package, and wanted to know if this is possible
with it, or if I should use something else, like a shapefile.
If I do
library(maps)
map(fill=T)
I have a map of the world filled in black. I would like to do the same but
with another color and with no political borders (only filled land areas).
If I do
map(fill=T, col="blue")
the borders are automatically added, and I cannot find a way to remove
them as the parameters boundary or interior are ignored if fill is true.
Thanks a lot
Julien
</pre>
Julien Million 2014-07-30T09:18:13
Count number of change in a specified time interval
http://comments.gmane.org/gmane.comp.lang.r.general/312036
<pre>Dear R-helpers,
I want to count the number of times ATT_1 has changed in a period of 3
months (can be 4 months) from the first YEAR_MTH entry for a CASE_ID. So if
for a CASE_ID we have data only for two distinct YEAR_MTH, then all the
entries should be considered, otherwise only the relevant entries will be
considered for calculation.
E.g. if the first YEAR_MTH entry is 201304 then get the number of changes
till 201307(inclusive), similarly if the first YEAR_MTH entry is 201302
then get the number of changes till 201305.
Dataset
CASE_ID YEAR_MTH ATT_1
CB26A 201302 1
CB26A 201302 0
CB26A 201302 0
CB26A 201303 1
CB26A 201303 1
CB26A 201304 0
CB26A 201305 1
CB26A 201305 0
CB26A 201306 1
CB27A 201304 0
CB27A 201304 0
CB27A 201305 1
CB27A 201306 1
CB27A 201306 0
CB27A 201307 0
CB27A 201308 1
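A base R sketch of one way to do this count on the dataset above. It assumes that rows are already in chronological order within each CASE_ID and that a "change" means two consecutive rows with different ATT_1 values; the helper names month_index and count_changes are hypothetical.

```r
dat <- data.frame(
  CASE_ID  = rep(c("CB26A", "CB27A"), times = c(9, 7)),
  YEAR_MTH = c(201302, 201302, 201302, 201303, 201303, 201304, 201305, 201305, 201306,
               201304, 201304, 201305, 201306, 201306, 201307, 201308),
  ATT_1    = c(1, 0, 0, 1, 1, 0, 1, 0, 1,
               0, 0, 1, 1, 0, 0, 1)
)

# Convert YYYYMM to a running month index so that adding months works across years.
month_index <- function(ym) (ym %/% 100) * 12 + ym %% 100

# Count value changes of ATT_1 within the window starting at the first month.
count_changes <- function(d, months_ahead = 3) {
  idx <- month_index(d$YEAR_MTH)
  keep <- idx <= min(idx) + months_ahead  # end month is inclusive
  sum(diff(d$ATT_1[keep]) != 0)
}

changes <- sapply(split(dat, dat$CASE_ID), count_changes)
changes
# CB26A CB27A
#     5     2
```

Passing months_ahead = 4 would give the four-month variant mentioned in the post.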
The final dataset should look like
ID_CASE N
</pre>
Abhinaba Roy 2014-07-30T07:08:24
Out of example forecasting plot
http://comments.gmane.org/gmane.comp.lang.r.general/312035
<pre>
Dear R helping team,
I am trying to check whether the model I fitted is the best-fitting model for the data. I have 392 observations in total. I want to fit the model to the first 382 observations, make 10-step-ahead predictions, and check whether my model fits the last 10 observations well.
Here is the R code I wrote
library(tsDyn)  # assumed source of linear(), setar(), lstar(), nnetTs(), aar()
da <- read.table(file.choose(), header = TRUE)
# Produce a vector that can show the dates of the exchange rates.
y <- data.frame(time = seq(as.Date('2007-01-07'), by = 'weeks', length = 392))
ex <- da[, 2]
# Fit each candidate model to the full series.
mod <- list()
mod[["linear"]] <- linear(ex, m = 4)
mod[["setar"]] <- setar(ex, m = 4, thDelay = 1)
mod[["lstar"]] <- lstar(ex, m = 4, thDelay = 1)
mod[["nnetTs"]] <- nnetTs(ex, m = 4, size = 3)
mod[["aar"]] <- aar(ex, m = 4)
set.seed(10)
# Refit on the training part and hold out the rest for testing.
mod.test <- list()
ex.train <- window(ex, end = 380)
ex.test <- window(ex, start = 381)
mod.test[["linear"]] <- linear(ex.train, m = 4)
mod.test[["setar"]] <- setar(ex.train, m = 4, thDelay = 1)
mod.test[["lstar"]] <- lstar(ex.train, m = 4, thDelay = 1, trace = FALSE,control = list(maxit = 1e+05)
</pre>
张天添 2014-07-29T22:50:44
a knitr question
http://comments.gmane.org/gmane.comp.lang.r.general/312029
<pre>Hello!
When constructing code using knitr, is there a way to get the ">" prompt to
appear, please?
Everything else works great!!
Thank you!
Sincerely,
Erin
</pre>
Erin Hodgess 2014-07-29T22:22:56
Post hoc comparisons
http://comments.gmane.org/gmane.comp.lang.r.general/312022
<pre>Hi Folks,
I have been using the TukeyHSD to conduct post hoc comparisons. The challenge is I have two interacting factors, one of which has 6 levels (cover types), and the other 2 (study areas). As a result interpreting the post hoc comparisons is difficult when you get a list of every possible combination (66 of them). In grad school I was using SAS and I recall most of the post hoc tests that were available there provided a nice summary with letters that made it easy to show which groups differed from one another within an ANOVA model that included a significant interaction. Is there a way to summarize my result in R that makes it easier to indicate significant post hoc comparisons graphically?
Thanks for your help,
Doug Reid
CNFER
Lakehead University
</pre>
Reid, Doug (MNR 2014-07-29T19:57:29
Help with splitting up values in a data set
http://comments.gmane.org/gmane.comp.lang.r.general/312020
<pre>Good day,
I have a data set from a MySQL database with a description field whose values I want to split up in order to compare the description of one record to the others. This will help me to identify any patterns in the data being recorded.
Your help will be gladly appreciated.
Gayon
</pre>
Gayon Clarke 2014-07-29T18:04:35
Trouble with function nnetar
http://comments.gmane.org/gmane.comp.lang.r.general/312012
<pre> I was playing around with the nnetar function in the forecast package,
trying to generate a 3-year forecast (36 months), the R fit result is shown
below:
#Neural Nets Fitting and Forecast
Series: condataset$ConPcums97
Model: NNAR(13)
Call: nnetar(x = condataset$ConPcums97)
Average of 20 networks, each of which is
a 13-7-1 network with 106 weights
options were - linear output units
sigma^2 estimated as 5.34e+12
Now, R is sending me the following error message:
x<-forecast(fit, h=36)
Error in predict.nnet(X[[1L]], ...) : missing values in 'x'
Error in predict.nnet(X[[1L]], ...) : missing values in 'x'
Error in predict.nnet(X[[1L]], ...) : missing values in 'x'
I am attaching the same data I used to try the nnetar function. Can you
give me some guidance or tell me what could be done to generate the
forecast with this function?
Best regards,
Paul
</pre>
Paul Bernal 2014-07-29T16:35:44
Dependency Injection & Inversion of Control for Data
http://comments.gmane.org/gmane.comp.lang.r.general/312011
<pre>Greetings,
New to R, coming from Java (Spring).
We have many different data sources (CSV's) for our analysis. Some of them
need preprocessing at the time of analysis - doing it earlier and saving
the resultant table doesn't make sense.
My code is getting tangled quickly as I try to read.csv my many data files
and source both the preprocessing stuff as well as my analysis code.
I'm hoping for a streamlined method of injecting the data/code needed into
my analysis code, instead of imperatively sorting everything out at the top
of my analysis code.
Googling "Dependency Injection R" and "Inversion of Control R" gave nothing
useful. Searching for "Dependency Management" brought me to the packrat
package, but that doesn't seem to have the injection element I'm looking
for (as I would expect from such a system).
Am I barking up the wrong tree? I can't imagine my problem is a new one.
How do you solve it?
Cheers,
Reed
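One lightweight way to get a similar effect in plain R, sketched under the assumption that closures are an acceptable stand-in for a DI container: wrap each data source in a function that loads and preprocesses on demand, let the analysis declare what it needs as arguments, and resolve those arguments by name. The names make_csv_source, inject, and analyse are hypothetical, and inline CSV text stands in for real files.

```r
# Each source is a closure that reads and preprocesses only when called.
make_csv_source <- function(text, preprocess = identity) {
  function() preprocess(read.csv(text = text))
}

sources <- list(
  sales = make_csv_source("id,amount\n1,10\n2,30"),
  costs = make_csv_source("id,cost\n1,4\n2,6",
                          preprocess = function(d) transform(d, cost = cost * 2))
)

# The analysis names its inputs as arguments and never calls read.csv itself.
analyse <- function(sales, costs) {
  merged <- merge(sales, costs, by = "id")
  sum(merged$amount - merged$cost)
}

# "Injection": look up each argument name in the registry and call the loader.
inject <- function(f, registry) {
  needed <- names(formals(f))
  do.call(f, lapply(registry[needed], function(loader) loader()))
}

profit <- inject(analyse, sources)
profit
# [1] 20
```

Swapping a source for a test fixture then only means replacing one entry in the sources list.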
</pre>
Reed Spool 2014-07-29T16:02:31
Copulas and spatial modeling
http://comments.gmane.org/gmane.comp.lang.r.general/312007
<pre>We are interested in using copulas for spatial modeling of environmental
data. We are new to R and new to copulas. We're looking for some guidance
on how to use copulas for this application. I've searched the r-help
archives and have not found the information I need. Can anyone point me to
information or examples of how to apply copulas to spatial modeling of
environmental data?
Thanks.
*Dave Leighton*
*HydroFocus, Inc.530-759-2484*
</pre>
Dave Leighton 2014-07-29T15:24:59
venn.diagram, error message: Incorrect number of elements
http://comments.gmane.org/gmane.comp.lang.r.general/312006
<pre>Hello, there,
I have 6 datasets and am trying to draw a Venn diagram using the R VennDiagram package, but got this error message. Could anybody figure out why?
Thanks
===
Loading required package: grid
V1
1 F_HO10000
2 F_HO10001
3 F_HO10002
4 F_HO10003
5 F_HO10004
6 F_HO10005
V1
1 F_HO1000
2 F_HO10000
3 F_HO10001
4 F_HO10002
5 F_HO10003
6 F_HO10004
V1
1 F_HO10000
2 F_HO10001
3 F_HO10002
4 F_HO10003
5 F_HO10004
6 F_HO10005
V1
1 F_HO1000
2 F_HO10000
3 F_HO10001
4 F_HO10002
5 F_HO10003
6 F_HO10004
V1
1 F_HO10001
2 F_HO10002
3 F_HO10004
4 F_HO10007
5 F_HO10008
6 F_HO10009
V1
1 F_HO1000
2 F_HO10000
3 F_HO10001
4 F_HO10002
5 F_HO10003
6 F_HO10004
Error: Incorrect number of elements.
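As far as I know, venn.diagram draws at most five sets, which would explain the error with six; treat that as an assumption to check against the package documentation. As an alternative view of six sets, the sketch below computes pairwise overlap counts in base R, using only the six rows printed for each dataset above as sample data. The names ids, sets, and overlap are illustrative.

```r
ids <- function(n) paste0("F_HO", n)

sets <- list(
  set1 = ids(10000:10005),
  set2 = ids(c(1000L, 10000:10004)),
  set3 = ids(10000:10005),
  set4 = ids(c(1000L, 10000:10004)),
  set5 = ids(c(10001L, 10002L, 10004L, 10007L, 10008L, 10009L)),
  set6 = ids(c(1000L, 10000:10004))
)

# Matrix of pairwise intersection sizes; the diagonal holds each set's own size.
overlap <- sapply(sets, function(a) {
  sapply(sets, function(b) length(intersect(a, b)))
})

overlap["set1", "set2"]
# [1] 5
overlap["set1", "set5"]
# [1] 3
```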
</pre>
Fix Ace 2014-07-29T15:11:09
analyzing qualitative data sets
http://comments.gmane.org/gmane.comp.lang.r.general/312002
<pre>Hello list
I'm just beginning my PhD and am likely to be using lots of surveys in
my data collection, and am wanting to get my head around the ideas about
how best to approach the tasks in R.
The data sets I have collected so far for some preliminary practise with
are made up of the following survey data:
(1) 25 observations x 15 variables of dichotomous nominal (categorical)
data [basically, yes/ no responses with a couple of missing values]
(2) 25 obs x 14 var of ordinal rank data [5 item Likert-scale, with some
missing values], and
(3) 23 observations of free text, typically in the form of one sentence
or statement, and I will be using RQDA for that part.
So far, I have been able to piece together that I can use the Spearman
method of the wilcox.test for #2 (ordinal data), but have yet to find
anything that I can do for the nominal data. I was thinking of using
frequency tables, but I don't seem to be able to find out too much info
on it/ how to do that.
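For the frequency tables mentioned above, a minimal base R sketch with made-up yes/no responses (the data frame and column names are hypothetical):

```r
responses <- data.frame(
  q1 = c("yes", "no", "yes", NA, "yes"),
  q2 = c("no", "no", "yes", "yes", NA)
)

# One-way frequency table, with missing values counted separately.
freq_q1 <- table(responses$q1, useNA = "ifany")

# Proportions among the non-missing answers.
prop_q1 <- prop.table(table(responses$q1))

# Two-way cross-tabulation of two questions (rows with NA are dropped by default).
cross <- table(q1 = responses$q1, q2 = responses$q2)
cross
```

Here freq_q1 counts three "yes", one "no", and one missing answer for the first question.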
Anyway, I have three questions that
</pre>
Sun Shine 2014-07-29T13:01:23
outputting R loop to a csv file
http://comments.gmane.org/gmane.comp.lang.r.general/311997
<pre>Hello,
My name is Jenny Jiang and I am a Finance Honours research student from the University of New South Wales. Currently my research project involves calculating some network centrality measures in R using a loop; however, I am having some trouble outputting my loop results in the desired CSV format.
Basically what I am doing is that for each firm year, I will need to calculate four different measures based on director id and connected director id and output these to the CSV file. I have provided in the attachment the code that I used for the R loop and CSV outputting (main-6.R). Using an example CSV file (data example 2), the output result I get is as shown in measure1.csv. As shown in the output file, the results are really messy, where for each firm year, all director ids and each type of measure for all directors are displayed in one cell. However, the desired format of output that I would like is as shown in output data template.xlsx.
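A common pattern for this kind of output, sketched here with invented data because the attached script is not available: build one small data frame per firm year inside the loop, with one row per director, then bind them together and write the result once. The column names and the placeholder measure are assumptions.

```r
firm_years <- data.frame(firm = c("A", "A", "B"), year = c(2010, 2011, 2010))
director_ids <- c(101, 102)

rows <- vector("list", nrow(firm_years))
for (i in seq_len(nrow(firm_years))) {
  # One row per director for this firm year; measure1 is a placeholder value.
  rows[[i]] <- data.frame(
    firm        = firm_years$firm[i],
    year        = firm_years$year[i],
    director_id = director_ids,
    measure1    = seq_along(director_ids) * i
  )
}

# Stack the per-firm-year pieces into one long table and write it out once.
result <- do.call(rbind, rows)
out_file <- tempfile(fileext = ".csv")
write.csv(result, out_file, row.names = FALSE)
```

Each director then occupies its own row in the CSV instead of all values landing in a single cell.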
As a result, I was just wondering if you could be abl
</pre>
Jenny Jiang 2014-07-29T01:47:38