<?xml version="1.0" encoding="UTF-8"?>
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns="http://purl.org/rss/1.0/" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:syn="http://purl.org/rss/1.0/modules/syndication/" xmlns:admin="http://webns.net/mvcb/">
  <channel rdf:about="http://blog.gmane.org/gmane.comp.lang.r.sequencing">
    <title>gmane.comp.lang.r.sequencing</title>
    <link>http://blog.gmane.org/gmane.comp.lang.r.sequencing</link>
    <description/>
    <syn:updatePeriod>hourly</syn:updatePeriod>
    <syn:updateFrequency>1</syn:updateFrequency>
    <syn:updateBase>1901-01-01T00:00+00:00</syn:updateBase>
    <items>
      <rdf:Seq>
        <rdf:li rdf:resource="http://comments.gmane.org/gmane.comp.lang.r.sequencing/2271"/>
        <rdf:li rdf:resource="http://comments.gmane.org/gmane.comp.lang.r.sequencing/2270"/>
        <rdf:li rdf:resource="http://comments.gmane.org/gmane.comp.lang.r.sequencing/2267"/>
        <rdf:li rdf:resource="http://comments.gmane.org/gmane.comp.lang.r.sequencing/2265"/>
        <rdf:li rdf:resource="http://comments.gmane.org/gmane.comp.lang.r.sequencing/2259"/>
        <rdf:li rdf:resource="http://comments.gmane.org/gmane.comp.lang.r.sequencing/2258"/>
        <rdf:li rdf:resource="http://comments.gmane.org/gmane.comp.lang.r.sequencing/2255"/>
        <rdf:li rdf:resource="http://comments.gmane.org/gmane.comp.lang.r.sequencing/2250"/>
        <rdf:li rdf:resource="http://comments.gmane.org/gmane.comp.lang.r.sequencing/2245"/>
        <rdf:li rdf:resource="http://comments.gmane.org/gmane.comp.lang.r.sequencing/2244"/>
        <rdf:li rdf:resource="http://comments.gmane.org/gmane.comp.lang.r.sequencing/2242"/>
        <rdf:li rdf:resource="http://comments.gmane.org/gmane.comp.lang.r.sequencing/2238"/>
        <rdf:li rdf:resource="http://comments.gmane.org/gmane.comp.lang.r.sequencing/2233"/>
        <rdf:li rdf:resource="http://comments.gmane.org/gmane.comp.lang.r.sequencing/2227"/>
        <rdf:li rdf:resource="http://comments.gmane.org/gmane.comp.lang.r.sequencing/2224"/>
        <rdf:li rdf:resource="http://comments.gmane.org/gmane.comp.lang.r.sequencing/2223"/>
        <rdf:li rdf:resource="http://comments.gmane.org/gmane.comp.lang.r.sequencing/2218"/>
        <rdf:li rdf:resource="http://comments.gmane.org/gmane.comp.lang.r.sequencing/2213"/>
        <rdf:li rdf:resource="http://comments.gmane.org/gmane.comp.lang.r.sequencing/2212"/>
        <rdf:li rdf:resource="http://comments.gmane.org/gmane.comp.lang.r.sequencing/2206"/>
      </rdf:Seq>
    </items>
    <image rdf:resource="http://gmane.org/img/gmane-25t.png"/>
    <textinput rdf:resource="http://search.gmane.org/?group=$group=gmane.comp.lang.r.sequencing"/>
  </channel>
  <image rdf:about="http://gmane.org/img/gmane-25t.png">
    <title>Gmane</title>
    <url>http://gmane.org/img/gmane-25t.png</url>
    <link>http://gmane.org</link>
  </image>
  <item rdf:about="http://comments.gmane.org/gmane.comp.lang.r.sequencing/2271">
    <title>bioc-sig-sequencing list will be removed</title>
    <link>http://comments.gmane.org/gmane.comp.lang.r.sequencing/2271</link>
    <description>&lt;pre&gt;Hi all,

We will be closing the bioc-sig-sequencing email list. The reasons are
that sequencing is now a mainstream part of the Bioconductor
project, posts to one list are often relevant to both, new users are
confused about which list is appropriate for posting, and a large fraction of
bioc-sig-sequencing subscribers also subscribe to the main
Bioconductor mailing list.

The list archive will continue to be available at

 https://stat.ethz.ch/pipermail/bioc-sig-sequencing/

For continued support, you may use the Bioconductor mailing list

 http://bioconductor.org/help/mailing-list/

including a new feature allowing questions to be posted as a guest

 http://bioconductor.org/help/mailing-list/mailform/


The bioc-sig-sequencing list will be removed on Tuesday, October 4th.
&lt;/pre&gt;</description>
    <dc:creator>Dan Tenenbaum</dc:creator>
    <dc:date>2011-09-30T19:51:11</dc:date>
  </item>
  <item rdf:about="http://comments.gmane.org/gmane.comp.lang.r.sequencing/2270">
    <title>Posting to the Bioconductor list without subscribing</title>
    <link>http://comments.gmane.org/gmane.comp.lang.r.sequencing/2270</link>
    <description>&lt;pre&gt;Hi all,

We've added a new feature to the Bioconductor web site: a form through which
you can post to the Bioconductor email list without first subscribing to it.
The form contains a captcha to protect against spammers.

The form is here:
http://bioconductor.org/help/mailing-list/mailform/

and it can also be reached through the Mailing Lists link on the main page
at http://bioconductor.org.

Enjoy,
Dan

[[alternative HTML version deleted]]
&lt;/pre&gt;</description>
    <dc:creator>Dan Tenenbaum</dc:creator>
    <dc:date>2011-09-30T19:12:44</dc:date>
  </item>
  <item rdf:about="http://comments.gmane.org/gmane.comp.lang.r.sequencing/2267">
    <title>Another ScanBamParam suggestion</title>
    <link>http://comments.gmane.org/gmane.comp.lang.r.sequencing/2267</link>
    <description>&lt;pre&gt;Following Janet's example, I would also like to propose an upgrade to
ScanBamParam:

It would be great if we could tell ScanBamParam that we want to load
only the reads that passed the vendor's quality filter.

In other words, the functionality I am suggesting is analogous to the
filter in readAligned() from the ShortRead library.


With the new release of Illumina sequencing reagents (version 3) you
get 200 million reads per lane from the HiSeq 2000. In my view, with
samples that big becoming popular, any investment in "read in"
efficiency is a good investment. I would be happy to provide a sample
BAM for those interested in addressing this suggestion.

It is also my humble opinion that we should start considering
parallelisation for reading in. I hope that I am not just wishing too
much.

Thank you,

Ivan
&lt;/pre&gt;</description>
    <dc:creator>Ivan Gregoretti</dc:creator>
    <dc:date>2011-09-30T14:48:32</dc:date>
  </item>
  <item rdf:about="http://comments.gmane.org/gmane.comp.lang.r.sequencing/2265">
    <title>ScanBamParam suggestion</title>
    <link>http://comments.gmane.org/gmane.comp.lang.r.sequencing/2265</link>
    <description>&lt;pre&gt;Hi,  

I have a suggestion to make ScanBamParam easier to use for coding amateurs like myself (I'm still sometimes confused with the many ways to encode genomic regions):

Is it easy/possible to change bamWhich function to accept GRanges objects, rather than requiring RangesList?  See below...

thanks, as usual,

Janet



############
library(Rsamtools)
myGR &amp;lt;- GRanges(seqnames="chr1",ranges=IRanges(start=1,end=100))

# I can use GRanges as the "which" argument if I do it when I create the ScanBamParam object
myparams1 &amp;lt;- ScanBamParam(which=myGR)

#but not if I try to set later
myparams2 &amp;lt;- ScanBamParam()
bamWhich(myparams2) &amp;lt;- myGR
### Error in checkSlotAssignment(object, name, value) : 
###   assignment of an object of class "GRanges" is not valid for slot "which" 
### in an object of class "ScanBamParam"; is(value, "RangesList") is not TRUE

## it's OK, though coercion does work.  
bamWhich(myparams2) &amp;lt;- as(myGR,"RangesList")

sessionInfo()
R version 2.13.1 (2011-07-08)
Platform: i386-apple-darwin9.8.0/i3&lt;/pre&gt;</description>
    <dc:creator>Janet Young</dc:creator>
    <dc:date>2011-09-30T00:26:24</dc:date>
  </item>
  <item rdf:about="http://comments.gmane.org/gmane.comp.lang.r.sequencing/2259">
    <title>a weird problem</title>
    <link>http://comments.gmane.org/gmane.comp.lang.r.sequencing/2259</link>
    <description>&lt;pre&gt;dear all:
     Using vcountPattern, I found some matched sequences,
but those are not similar to the pattern.
     See the code below:

rm(list=ls())
reads &amp;lt;- readFastq(fastqfile)  # downloaded from http://biocluster.ucr.edu/~tbackman/query.fastq
seqs &amp;lt;- sread(reads);
PCR2rc&amp;lt;-DNAString("AGATCGGAAGAGCTCGTATGCCGTCTTCTGCTTGAAACAAA")
result &amp;lt;- vcountPattern(PCR2rc, seqs, max.mismatch=1, min.mismatch=0,
with.indels=TRUE, algorithm="indels")
reads &amp;lt;- reads[result]
seqs &amp;lt;- sread(reads)
sum(result)
     Then, using countPattern, I found that they really do not match:

subject1 = "GTTGGTGCAAACATTAGTTCTTCTGTTGGTGCAACCTTTG"
result &amp;lt;- countPattern(PCR2rc, subject1, max.mismatch=1, min.mismatch=0,
with.indels=TRUE)
[1] 0

shan gao

&lt;/pre&gt;</description>
    <dc:creator>wang peter</dc:creator>
    <dc:date>2011-09-28T20:14:24</dc:date>
  </item>
  <item rdf:about="http://comments.gmane.org/gmane.comp.lang.r.sequencing/2258">
    <title>coverage vector longer than covered area (ScanBam and IRanges)?</title>
    <link>http://comments.gmane.org/gmane.comp.lang.r.sequencing/2258</link>
    <description>&lt;pre&gt;Hi there,
maybe I am missing something obvious in my code.
I extracted information from a BAM file on 11 regions.
When I calculate the coverage on one of these regions, I get a vector which is much longer than the number of bases of the region itself.

I tried several times to replicate this problem on a small IRanges object but - of course - everything was OK.
Does anyone have an idea?

That's what I've done:

   chr     start       end
1    1 195915909 197867662
2    2   1199920   2234892 [...]

scanned my bam within those regions
which=GRanges(regions$chr,IRanges(regions$start,regions$end))
ggf&amp;lt;-scanBam("s_7_1_sequence.txt.novo.rmdup.bam_sorted.bam", 
             param=ScanBamParam(which=which,
                                what = c("pos", "qwidth"),
                                flag =scanBamFlag(isUnmappedQuery =FALSE)))

created a summary function, as suggested in the IRanges vignette
summaryFunction&amp;lt;-function(seqname,bamfile,...){
  x&amp;lt;-bamfile[[seqname]]
  coverage(IRanges(x[["pos"]],width=x[["qwidth"&lt;/pre&gt;</description>
    <dc:creator>Francesco Lescai</dc:creator>
    <dc:date>2011-09-27T19:53:45</dc:date>
  </item>
  <item rdf:about="http://comments.gmane.org/gmane.comp.lang.r.sequencing/2255">
    <title>what is</title>
    <link>http://comments.gmane.org/gmane.comp.lang.r.sequencing/2255</link>
    <description>&lt;pre&gt;hello every one,

i have such coding

"ATCGAGATCGGAAGAGCGGTTCAGCAGGAATGCCGAGACCGATCTCGTATGCCGTCTTCTGCTTGAAAAAAAAAAAATATT"

"AATCGAGATCGGAAGAGCGGTTCAGCAGGAATGCCGAGACCGATCTCGTATGCCGTCTTCTGCTTG"
with.indels=TRUE)
  Views on a 81-letter BString subject
subject:
ATCGAGATCGGAAGAGCGGTTCAGCAGGAATGCCGAGACCGATCTCGTATGCCGTCTTCTGCTTGAAAAAAAAAAAATATT
views: NONE
max.Lmismatch=0,with.Lindels=TRUE)
[1] "AAAAAAAAAAAATATT"

i already allow the indels,but why matchPattern cannot find the pattern in
subject
what does with.indels mean?
i am confused
thx
shan gao

&lt;/pre&gt;</description>
    <dc:creator>wang peter</dc:creator>
    <dc:date>2011-09-27T19:23:57</dc:date>
  </item>
  <item rdf:about="http://comments.gmane.org/gmane.comp.lang.r.sequencing/2250">
    <title>Read big fastq files in chunks</title>
    <link>http://comments.gmane.org/gmane.comp.lang.r.sequencing/2250</link>
    <description>&lt;pre&gt;Hi,
I should filter a big fastq file (HiSeq ca. &amp;gt;30'000'000, which does not
fit in memory) for low quality, e.g. too many N in the sequence.

I plan to read the file in chunks, filter the ShortReads and write them
into a new file.
This is no problem for fasta files, see the example:

filename &amp;lt;- "data/DmS2DRSC_RNA_seqs_subsample.fa"
eof &amp;lt;- FALSE
append &amp;lt;- FALSE
cycle &amp;lt;- 1L
while(!eof){
  chunk &amp;lt;- readFasta(filename, nrec=nrec, skip=(cycle-1)*nrec)
  nFilt &amp;lt;- nFilter(5)
  writeFasta(chunk[nFilt(chunk)],
             sprintf("%s/filtered_%s", 
                     dirname(filename), basename(filename)),
             append=append)
  append &amp;lt;- TRUE
  cycle &amp;lt;- cycle + 1
  if(length(chunk) == 0L)
      eof &amp;lt;- TRUE
}

If I try to do the same for fastq file e.g.

filename &amp;lt;- "data/test_data.fq"
eof &amp;lt;- FALSE
mode &amp;lt;- "w"
cycle &amp;lt;- 1L
while(!eof){
  chunk &amp;lt;- readFastq(filename)
  nFilt &amp;lt;- nFilter(5)
  writeFastq(chunk[nFilt(chunk)],
             sprintf("%s/filtered_%s", 
                     dirname(filename), basenam&lt;/pre&gt;</description>
    <dc:creator>Anita Lerch</dc:creator>
    <dc:date>2011-09-27T09:39:05</dc:date>
  </item>
  <item rdf:about="http://comments.gmane.org/gmane.comp.lang.r.sequencing/2245">
    <title>question about trimLRPatterns</title>
    <link>http://comments.gmane.org/gmane.comp.lang.r.sequencing/2245</link>
    <description>&lt;pre&gt;dear harris:
thank you very much for your previous help, but I am still confused by the following
problems:
[1] why does the second section of code not work, while the first does?
subject = "TTTACGT"
Lpattern = "TTTAACGT"
trimLRPatterns(Lpattern = Lpattern, subject = subject,
max.Lmismatch=1,with.Lindels=TRUE)
subject = "TGCATTT"
Rpattern = "TGCAATTT"
trimLRPatterns(Rpattern = Rpattern, subject = subject,
max.Rmismatch=1,with.Rindels=TRUE)
Error in solveUserSEW(width(x), start = start, end = end, width = width) :
  solving row 1: 'allow.nonnarrowing' is FALSE and the supplied start (0) is
&amp;lt; 1

[2] how can i see the code of such functions: which.isMatchingStartingAt,
rev, normargPattern
which are called by Biostrings:::.computeTrimEnd
showMethods("which.isMatchingStartingAt") does not work
[3] max.Rmismatch=0.1 will be replaced by 0.1*nchar(Rpattern) and then by
(-1,-1,...as.integer(0.1*nchar(Rpattern)))
to better control each try, can i use max.Rmismatch=0.1*(1:nchar(Rpattern))

good luck
shan gao

&lt;/pre&gt;</description>
    <dc:creator>wang peter</dc:creator>
    <dc:date>2011-09-26T20:51:51</dc:date>
  </item>
  <item rdf:about="http://comments.gmane.org/gmane.comp.lang.r.sequencing/2244">
    <title>remove adaptor before remove barcode</title>
    <link>http://comments.gmane.org/gmane.comp.lang.r.sequencing/2244</link>
    <description>&lt;pre&gt;dear all:
 I found a problem: trimLRPatterns cannot handle the situation
where the Rpattern is shifted left relative to the subject sequence; see below

subject
 ATCGAGATCGGAAGAGCGGTTCAGCAGGAATGCCGAGACCGATCTCGTATGCCGTCTTCTGCTTGAAAAAAAAAAAATATT

Rpattern
AATCGAGATCGGAAGAGCGGTTCAGCAGGAATGCCGAGACCGATCTCGTATGCCGTCTTCTGCTTG

so I changed the code to keep the 5' barcode GCATT, so that the pattern can be
recognized


subject
 GCATTATCGAGATCGGAAGAGCGGTTCAGCAGGAATGCCGAGACCGATCTCGTATGCCGTCTTCTGCTTGAAAAAAAAAAAATATT

Rpattern
           AATCGAGATCGGAAGAGCGGTTCAGCAGGAATGCCGAGACCGATCTCGTATGCCGTCTTCTGCTTG

after removing the 3' adaptor, I can remove the 5' barcode

any ideas?

shan gao

&lt;/pre&gt;</description>
    <dc:creator>wang peter</dc:creator>
    <dc:date>2011-09-26T13:37:37</dc:date>
  </item>
  <item rdf:about="http://comments.gmane.org/gmane.comp.lang.r.sequencing/2242">
    <title>how to write a fasta file</title>
    <link>http://comments.gmane.org/gmane.comp.lang.r.sequencing/2242</link>
    <description>&lt;pre&gt;hi every one:

I read a fastq file and tried to save it as fasta,
but it failed

fastqfile="unmapped"
library(ShortRead)
reads &amp;lt;- readFastq(fastqfile)
seqs &amp;lt;- sread(reads)
ids &amp;lt;- id(reads)

writeFasta(seqs, file="unmapped.fa")
Error in function (classes, fdef, mtable)  :
  unable to find an inherited method for function "writeFasta", for
signature "DNAStringSet"

writeFASTA(seqs, file="unmapped.fa")
Error in for (rec in x) { : invalid for() loop sequence

what is the problem
thank you

shan gao

&lt;/pre&gt;</description>
    <dc:creator>wang peter</dc:creator>
    <dc:date>2011-09-25T23:25:23</dc:date>
  </item>
  <item rdf:about="http://comments.gmane.org/gmane.comp.lang.r.sequencing/2238">
    <title>Reads in 3'utr</title>
    <link>http://comments.gmane.org/gmane.comp.lang.r.sequencing/2238</link>
    <description>&lt;pre&gt;Hi everyone,
I am doing NGS analysis using BAM files. I have counted reads in the 3' UTR regions using 
utr=threeUTRsByTranscript(txdb,use.names=FALSE)
countsUTR &amp;lt;- countOverlaps(utr,reads)
I have got the transcript-level counts from this. How can I get the gene-level counts? It might sound silly, but does anybody have an idea of what type of analyses we can do with this countsUTR?
Thanks, Rohan

_______________________________________________
Bioc-sig-sequencing mailing list
Bioc-sig-sequencing-0bNBQ1PAWB4BXFe83j6qeQ&amp;lt; at &amp;gt;public.gmane.org
https://stat.ethz.ch/mailman/listinfo/bioc-sig-sequencing
&lt;/pre&gt;</description>
    <dc:creator>rohan bareja</dc:creator>
    <dc:date>2011-09-20T19:13:28</dc:date>
  </item>
  <item rdf:about="http://comments.gmane.org/gmane.comp.lang.r.sequencing/2233">
    <title>first remove low quality or adaptor</title>
    <link>http://comments.gmane.org/gmane.comp.lang.r.sequencing/2233</link>
    <description>&lt;pre&gt;I think removing low quality first is better,
because low-quality positions at the 3' end will lead to many mismatches,
so you cannot set the mismatch parameter well.

After removing the low-quality 3' end, it is easy to get a good match between the 3' end
and the adaptor.

But I met another problem when I trimmed the adaptor from the data without
the low-quality 3' end:

the lengths of the reads are not uniform; the max is 40.
When I use a 40-long adaptor
PCR2&amp;lt;- DNAString("AGATCGGAAGAGCTCGTATGCCGTCTTCTGCTTGAAACAA")
with.Rindels=T)
Error in solveUserSEW(width(x), start = start, end = end, width = width) :
  solving row 2934: 'allow.nonnarrowing' is FALSE and the supplied start (0)
is &amp;lt; 1

I think maybe it is because many read lengths are below 40,
but when I run this on Linux, it is OK

the r version:

sessionInfo()
R version 2.13.1 (2011-07-08)
Platform: x86_64-pc-mingw32/x64 (64-bit)
sessionInfo()
R version 2.11.1 (2010-05-31)
x86_64-unknown-linux-gnu

&lt;/pre&gt;</description>
    <dc:creator>wang peter</dc:creator>
    <dc:date>2011-09-21T01:33:51</dc:date>
  </item>
  <item rdf:about="http://comments.gmane.org/gmane.comp.lang.r.sequencing/2227">
    <title>how to subset reads or seqs</title>
    <link>http://comments.gmane.org/gmane.comp.lang.r.sequencing/2227</link>
    <description>&lt;pre&gt;dear all
     I have ids4, which is a subset of id(reads). I want to use ids4
to subset seqs or reads, but it still does not work.
Here is the code:


fastqfile="query.fastq"
library(ShortRead)
reads &amp;lt;- readFastq(fastqfile)
seqs &amp;lt;- sread(reads)
names(seqs)&amp;lt;- id(reads)
ids4&amp;lt;- read.table("ids4.txt")
seqs[ids4]

thx
gaoshan

&lt;/pre&gt;</description>
    <dc:creator>wang peter</dc:creator>
    <dc:date>2011-09-19T18:35:28</dc:date>
  </item>
  <item rdf:about="http://comments.gmane.org/gmane.comp.lang.r.sequencing/2224">
    <title>the question about fastx_clipper</title>
    <link>http://comments.gmane.org/gmane.comp.lang.r.sequencing/2224</link>
    <description>&lt;pre&gt;the question about fastx_clipper
I want to compare the result of fastx_clipper with my R code.
The query.fastq file includes 25000 entries of data,
1. fastx_clipper -a AGATCGGAAGAGCTCGTATGCCGTCTTCTGCTTGAAACAAA -l 0 -i
query.fastq -o fastx_output -Q33
output 24334 entries
2. fastx_clipper -a AGATCGGAAGAGCTCGTATGCCGTCTTCTGCTTGAAACAAA -c -l 0 -i
query.fastq -o fastx_output -Q33
output 2974 entries
3. fastx_clipper -a AGATCGGAAGAGCTCGTATGCCGTCTTCTGCTTGAAACAAA -C -l 0 -i
query.fastq -o fastx_output -Q33
output 21360 entries
4. fastx_clipper -a AGATCGGAAGAGCTCGTATGCCGTCTTCTGCTTGAAACAAA -k -l 0 -i
query.fastq -o fastx_output -Q33
output 364 entries

so 2+3=1, but 2+3+4 Not=25000
so I compared 25000-24334= 666 with 4, found 302 not in 4
when i run the same parameter with 4 on those 666 entries, i can only output
12 entries

when i trimmed the same adaptor on 666 entries, the output is only 12
entries.

who can tell me why?

&lt;/pre&gt;</description>
    <dc:creator>wang peter</dc:creator>
    <dc:date>2011-09-19T17:55:17</dc:date>
  </item>
  <item rdf:about="http://comments.gmane.org/gmane.comp.lang.r.sequencing/2223">
    <title>large BAM files and large bed files</title>
    <link>http://comments.gmane.org/gmane.comp.lang.r.sequencing/2223</link>
    <description>&lt;pre&gt;Hello,

I am experiencing a problem loading bed files of 30 GB into memory.
My call to read.table raises the error: Error in unique(x) :
length xxxxxx is too large for hashing.

This is generated by the function MKsetup in the unique.c file. Even after
increasing the value by 10 000x, the error persists. I believe the
function pushes more data into RAM, but I am not sure this is the right
thing to focus on. 

Ultimately, I would like to produce a GenomeData object from either a
BAM file or a bed file.

Has anyone ever worked with very big BAM files (about 30 GB)?

thanks

Rene paradis
&lt;/pre&gt;</description>
    <dc:creator>Rene Paradis</dc:creator>
    <dc:date>2011-09-16T13:17:39</dc:date>
  </item>
  <item rdf:about="http://comments.gmane.org/gmane.comp.lang.r.sequencing/2218">
    <title>how to sort reads by ids</title>
    <link>http://comments.gmane.org/gmane.comp.lang.r.sequencing/2218</link>
    <description>&lt;pre&gt;here is the code

rm(list=ls())
fastqfile="query.fastq"
library(ShortRead)
reads &amp;lt;- readFastq(fastqfile)
ids &amp;lt;- id(reads)
seqs &amp;lt;- sread(reads)

How do I sort reads by ids?
thx
shan gao

&lt;/pre&gt;</description>
    <dc:creator>wang peter</dc:creator>
    <dc:date>2011-09-18T16:46:35</dc:date>
  </item>
  <item rdf:about="http://comments.gmane.org/gmane.comp.lang.r.sequencing/2213">
    <title>large BAM files and large BED files</title>
    <link>http://comments.gmane.org/gmane.comp.lang.r.sequencing/2213</link>
    <description>&lt;pre&gt;Hello,

I am experiencing a problem loading bed files of 30 GB into memory.
My call to read.table raises the error: Error in unique(x) :
length xxxxxx is too large for hashing.

This is generated by the function MKsetup in the unique.c file. Even after
increasing the value by 10 000x, the error persists. I believe the
function pushes more data into RAM, but I am not sure this is the right
thing to focus on. 

Ultimately, I would like to produce a GenomeData object from either a
BAM file or a bed file.

Has anyone ever worked with very big BAM files (about 30 GB)?

thanks

Rene paradis
&lt;/pre&gt;</description>
    <dc:creator>Rene Paradis</dc:creator>
    <dc:date>2011-09-16T20:44:49</dc:date>
  </item>
  <item rdf:about="http://comments.gmane.org/gmane.comp.lang.r.sequencing/2212">
    <title>edgeR tagwise estimates not converging to common estimate with large prior.n value</title>
    <link>http://comments.gmane.org/gmane.comp.lang.r.sequencing/2212</link>
    <description>&lt;pre&gt;Hi,

Thanks in advance for any help. I have the latest R software (2.13.1) and
edgeR software (2.8.4). I'm running into a problem where I estimate a common
dispersion parameter of 0.0001 and when I subsequently estimate tagwise
dispersions using the default prior.n = 10, the summary statistics are

  Min.  1st Qu.  Median   Mean  3rd Qu.   Max.
 0.001    0.001   0.001  0.001    0.001  0.022

ie, all estimates are 10 times larger than the common dispersion estimate.
Since the method is supposed to shrink toward the common value this seems a
little surprising. When I increase prior.n to a large number I expect the
tagwise estimates to all converge to the common dispersion, but as you might
guess from the table above it converges to 0.001 = 10*common.

The data comes from the bioconductor package "yeastRNASeq" and it appears
from the description of the data that the two samples in each group are
actually from sequencing the same extraction of mRNA, ie not biological and
not even really technical repl&lt;/pre&gt;</description>
    <dc:creator>Sean Ruddy</dc:creator>
    <dc:date>2011-09-16T01:03:28</dc:date>
  </item>
  <item rdf:about="http://comments.gmane.org/gmane.comp.lang.r.sequencing/2206">
    <title>a question about write file</title>
    <link>http://comments.gmane.org/gmane.comp.lang.r.sequencing/2206</link>
    <description>&lt;pre&gt;the pairwiseAlignment() function returns a
PairwiseAlignedXStringSet

how can I export it to a file?

thx
shan gao

&lt;/pre&gt;</description>
    <dc:creator>wang peter</dc:creator>
    <dc:date>2011-09-13T20:59:51</dc:date>
  </item>
  <item rdf:about="http://comments.gmane.org/gmane.comp.lang.r.sequencing/2204">
    <title>Course: Introduction to R / Bioconductor for Sequence Analysis, October 17-18</title>
    <link>http://comments.gmane.org/gmane.comp.lang.r.sequencing/2204</link>
    <description>&lt;pre&gt;The Bioconductor team will offer an "Introduction to R / Bioconductor 
for Sequence Analysis" course on October 17/18 here in Seattle.

This course introduces participants to the use of R and Bioconductor for 
the analysis of high throughput sequence data. Participants are expected 
to have basic familiarity with R, and with relevant biological domains. 
Participants will gain a better understanding of how R works, and of the 
ways in which Bioconductor represents and manipulates sequence data. 
Participants will gain experience in simple work flows for the analysis 
of common next-generation sequence experiments.

See you there!

Martin
&lt;/pre&gt;</description>
    <dc:creator>Martin Morgan</dc:creator>
    <dc:date>2011-09-13T16:03:36</dc:date>
  </item>
  <textinput rdf:about="http://search.gmane.org/?group=$group=gmane.comp.lang.r.sequencing">
    <title>Search Engine</title>
    <description>Search the mailing list at Gmane</description>
    <name>query</name>
    <link>http://search.gmane.org/?group=$group=gmane.comp.lang.r.sequencing</link>
  </textinput>
</rdf:RDF>

