<?xml version="1.0" encoding="UTF-8"?>
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns="http://purl.org/rss/1.0/" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:syn="http://purl.org/rss/1.0/modules/syndication/" xmlns:admin="http://webns.net/mvcb/">
  <channel rdf:about="http://permalink.gmane.org/gmane.comp.python.opencl">
    <title>gmane.comp.python.opencl</title>
    <link>http://permalink.gmane.org/gmane.comp.python.opencl</link>
    <description/>
    <syn:updatePeriod>hourly</syn:updatePeriod>
    <syn:updateFrequency>1</syn:updateFrequency>
    <syn:updateBase>1901-01-01T00:00+00:00</syn:updateBase>
    <items>
      <rdf:Seq>
        <rdf:li rdf:resource="http://permalink.gmane.org/gmane.comp.python.opencl/1492"/>
        <rdf:li rdf:resource="http://permalink.gmane.org/gmane.comp.python.opencl/1491"/>
        <rdf:li rdf:resource="http://permalink.gmane.org/gmane.comp.python.opencl/1490"/>
        <rdf:li rdf:resource="http://permalink.gmane.org/gmane.comp.python.opencl/1489"/>
        <rdf:li rdf:resource="http://permalink.gmane.org/gmane.comp.python.opencl/1488"/>
        <rdf:li rdf:resource="http://permalink.gmane.org/gmane.comp.python.opencl/1487"/>
        <rdf:li rdf:resource="http://permalink.gmane.org/gmane.comp.python.opencl/1486"/>
        <rdf:li rdf:resource="http://permalink.gmane.org/gmane.comp.python.opencl/1485"/>
        <rdf:li rdf:resource="http://permalink.gmane.org/gmane.comp.python.opencl/1484"/>
        <rdf:li rdf:resource="http://permalink.gmane.org/gmane.comp.python.opencl/1483"/>
        <rdf:li rdf:resource="http://permalink.gmane.org/gmane.comp.python.opencl/1482"/>
        <rdf:li rdf:resource="http://permalink.gmane.org/gmane.comp.python.opencl/1481"/>
        <rdf:li rdf:resource="http://permalink.gmane.org/gmane.comp.python.opencl/1480"/>
        <rdf:li rdf:resource="http://permalink.gmane.org/gmane.comp.python.opencl/1479"/>
        <rdf:li rdf:resource="http://permalink.gmane.org/gmane.comp.python.opencl/1478"/>
        <rdf:li rdf:resource="http://permalink.gmane.org/gmane.comp.python.opencl/1477"/>
        <rdf:li rdf:resource="http://permalink.gmane.org/gmane.comp.python.opencl/1476"/>
        <rdf:li rdf:resource="http://permalink.gmane.org/gmane.comp.python.opencl/1475"/>
        <rdf:li rdf:resource="http://permalink.gmane.org/gmane.comp.python.opencl/1474"/>
        <rdf:li rdf:resource="http://permalink.gmane.org/gmane.comp.python.opencl/1473"/>
      </rdf:Seq>
    </items>
    <image rdf:resource="http://gmane.org/img/gmane-25t.png"/>
    <textinput rdf:resource=""/>
  </channel>
  <image rdf:about="http://gmane.org/img/gmane-25t.png">
    <title>Gmane</title>
    <url>http://gmane.org/img/gmane-25t.png</url>
    <link>http://gmane.org</link>
  </image>
  <item rdf:about="http://permalink.gmane.org/gmane.comp.python.opencl/1492">
    <title>Re: PyOpenCL on PyPy</title>
    <link>http://permalink.gmane.org/gmane.comp.python.opencl/1492</link>
    <description>&lt;pre&gt;
Since many of the objects interact, we'd also have to build an interface
layer between the boost-python wrappers and the cffi ones--which I'd
rather avoid.

IOW, I'm not sure it makes sense to have both in the same tree. I'd say
let's start slow by just wrapping the bare necessities from Context,
CommandQueue, Buffer, Program, and Kernel. (while taking notes on what
was left out) That should give us a proof-of-concept idea of how
difficult (and successful) this is all going to be.


Coverage is good. If it passes, it's in releasable state.

Andreas

&lt;/pre&gt;</description>
    <dc:creator>Andreas Kloeckner</dc:creator>
    <dc:date>2013-05-20T23:10:48</dc:date>
  </item>
  <item rdf:about="http://permalink.gmane.org/gmane.comp.python.opencl/1491">
    <title>Re: PyOpenCL on PyPy</title>
    <link>http://permalink.gmane.org/gmane.comp.python.opencl/1491</link>
    <description>&lt;pre&gt;I prefer top-posting, let me know if it's a no-no.
It seems the way to proceed would be by creating another parallel 
low-level interface to the system calls via cffi without pyboost, and 
then slowly replace the existing low-level interfaces with the new ones 
until we can eventually remove the old ones. Hopefully that will not 
cause too many memory leaks in the transition.

How is the coverage of the test suite? Can we be reasonably confident
that if tests pass and the benchmark does not suffer, the package is
still functional?
Matti

On 05/19/2013 02:54 AM, Andreas Kloeckner wrote:

&lt;/pre&gt;</description>
    <dc:creator>Matti Picus</dc:creator>
    <dc:date>2013-05-20T18:15:37</dc:date>
  </item>
  <item rdf:about="http://permalink.gmane.org/gmane.comp.python.opencl/1490">
    <title>Re: PyOpenCL on PyPy</title>
    <link>http://permalink.gmane.org/gmane.comp.python.opencl/1490</link>
    <description>&lt;pre&gt;Hi Matti,

Matti Picus &amp;lt;matti-4WHkSwmb+mMQhDRD3CWCHA&amp;lt; at &amp;gt;public.gmane.org&amp;gt; writes:

I've been thinking about replacing the Boost.Python bits in PyOpenCL
with cffi, and I'd be interested in figuring out how feasible all this
is. It certainly looks doable from my point of view. My main objective
in getting PyPy to work with PyOpenCL is making kernel invocation even
cheaper than it currently is, or rather, be able to afford more
kernel-invocation-time niceties than we currently can.

I'd obviously like to preserve PyOpenCL's documented interface.

Andreas

&lt;/pre&gt;</description>
    <dc:creator>Andreas Kloeckner</dc:creator>
    <dc:date>2013-05-18T23:54:22</dc:date>
  </item>
  <item rdf:about="http://permalink.gmane.org/gmane.comp.python.opencl/1489">
    <title>PyOpenCL + Beignet</title>
    <link>http://permalink.gmane.org/gmane.comp.python.opencl/1489</link>
    <description>&lt;pre&gt;Hi all,

Any idea how I could get access to my HD4000 in Debian?  PyOpenCL seems to
be working fine using the 'AMD Accelerated Parallel Processing' device,
which I'm guessing is my CPU?  But I'd like to be able to use the HD4000.

I installed beignet0.0.1 and beignet-dev packages and naively created a
cl.icd file under /etc/OpenCL/vendors that had just a 'libcl.so' entry (and
temporarily removed amdocl64.icd), but then got this error:

1 cl.get_platforms()
LogicError: clGetPlatformIDs failed: platform not found khr

Any steps on how to get Beignet working would be greatly appreciated.

Thank you.
&lt;/pre&gt;</description>
    <dc:creator>a cow-like object</dc:creator>
    <dc:date>2013-05-18T20:35:18</dc:date>
  </item>
  <item rdf:about="http://permalink.gmane.org/gmane.comp.python.opencl/1488">
    <title>PyOpenCL on PyPy</title>
    <link>http://permalink.gmane.org/gmane.comp.python.opencl/1488</link>
    <description>&lt;pre&gt;Hi.
I am interested in getting PyOpenCL to work with PyPy, an
implementation of Python with a JIT (www.pypy.org). Has there been any
discussion or thought about doing this? PyPy has a basic
implementation of numpy called numpypy that I contribute to, and it
has a rudimentary numpy-compatible C interface available as an
external module at
https://bitbucket.org/antocuni/numpypy_c
The PyPy team has a CPython-compatible replacement for ctypes called
cffi, which is JIT-friendly on PyPy and no slower than ctypes on
CPython.
So it seems like all the pieces exist to start. Is anyone else interested in
getting the work done?
Or are there blocking issues I do not understand?
Matti


&lt;/pre&gt;</description>
    <dc:creator>Matti Picus</dc:creator>
    <dc:date>2013-05-18T18:56:39</dc:date>
  </item>
  <item rdf:about="http://permalink.gmane.org/gmane.comp.python.opencl/1487">
    <title>Re: Error running "abstract" example</title>
    <link>http://permalink.gmane.org/gmane.comp.python.opencl/1487</link>
    <description>&lt;pre&gt;
Ah, sorry. I meant:

print prg.sum.num_args

 I was thinking of "knl = prg.sum" as an instance of cl.Kernel.

 Andreas

&lt;/pre&gt;</description>
    <dc:creator>Andreas Kloeckner</dc:creator>
    <dc:date>2013-05-15T13:23:44</dc:date>
  </item>
  <item rdf:about="http://permalink.gmane.org/gmane.comp.python.opencl/1486">
    <title>Re: Error running "abstract" example</title>
    <link>http://permalink.gmane.org/gmane.comp.python.opencl/1486</link>
    <description>&lt;pre&gt;Dear cow-like-object,

a cow-like object &amp;lt;acowlikeobject-Re5JQEeQqe8AvxtiuMwx3w&amp;lt; at &amp;gt;public.gmane.org&amp;gt; writes:

This looks like the Intel GPU CL implementation has a bug--it seems to
be miscounting arguments. You can verify this by inserting

print knl.sum.num_args

in the program that works. If my guess is right, it'd be great if you
could report this to Intel, here:

http://software.intel.com/en-us/forums/intel-opencl-sdk/

As a workaround, just rip out the assert that generated the error. You
lose a sanity check, but the functionality shouldn't be affected. (Or
run with 'python -O', which just disables *all* asserts.)

Hope that helps,
Andreas

&lt;/pre&gt;</description>
    <dc:creator>Andreas Kloeckner</dc:creator>
    <dc:date>2013-05-15T02:50:41</dc:date>
  </item>
  <item rdf:about="http://permalink.gmane.org/gmane.comp.python.opencl/1485">
    <title>Error running "abstract" example</title>
    <link>http://permalink.gmane.org/gmane.comp.python.opencl/1485</link>
    <description>&lt;pre&gt;Hi all,

Very new to PyOpenCL.  Trying to get my feet wet by running the examples.

The following code works fine when running on Windows 8 / Intel HD 4000. I
see a result of 0.0.

import pyopencl as cl
import numpy
import numpy.linalg as la

a = numpy.random.rand(50000).astype(numpy.float32)
b = numpy.random.rand(50000).astype(numpy.float32)

ctx = cl.create_some_context()
queue = cl.CommandQueue(ctx)

mf = cl.mem_flags
a_buf = cl.Buffer(ctx, mf.READ_ONLY | mf.COPY_HOST_PTR, hostbuf=a)
b_buf = cl.Buffer(ctx, mf.READ_ONLY | mf.COPY_HOST_PTR, hostbuf=b)
dest_buf = cl.Buffer(ctx, mf.WRITE_ONLY, b.nbytes)

prg = cl.Program(ctx, """
    __kernel void sum(__global const float *a,
    __global const float *b, __global float *c)
    {
      int gid = get_global_id(0);
      c[gid] = a[gid] + b[gid];
    }
    """).build()

prg.sum(queue, a.shape, None, a_buf, b_buf, dest_buf)

a_plus_b = numpy.empty_like(a)
cl.enqueue_copy(queue, a_plus_b, dest_buf)

pr&lt;/pre&gt;</description>
    <dc:creator>a cow-like object</dc:creator>
    <dc:date>2013-05-14T23:39:22</dc:date>
  </item>
  <item rdf:about="http://permalink.gmane.org/gmane.comp.python.opencl/1484">
    <title>Re: Multiple-matrix products and two questions.</title>
    <link>http://permalink.gmane.org/gmane.comp.python.opencl/1484</link>
    <description>&lt;pre&gt;Hi Pedro,

Unfortunately, I do not have time right now to write custom kernels,
so I'll cheat a little.

Attached is an archive with a Python script that performs the
calculation using Till's algorithm (with sizes tuned down to 80 and 50
so that my laptop can handle them). It uses reikna (see
http://reikna.publicfields.net) 0.2.4 to generate kernels, which I
included in the archive with comments in the main script about when
they are called and with which parameters. These kernels are rendered
versions of https://github.com/Manticore/reikna/blob/develop/reikna/transpose.mako
and https://github.com/Manticore/reikna/blob/develop/reikna/matrixmul.mako
(look at them if you want to know where all the magic numbers come
from) which are, in turn, just generalized transposition and dot
kernels from nVidia CUDA/OpenCL SDK examples. There are some
weird-looking macros (which mostly do nothing in this case) in the
rendered versions, but I hope they are still quite readable.

Best regards,
Bogdan

On Sun, May 12, 2013&lt;/pre&gt;</description>
    <dc:creator>Bogdan Opanchuk</dc:creator>
    <dc:date>2013-05-12T06:42:41</dc:date>
  </item>
  <item rdf:about="http://permalink.gmane.org/gmane.comp.python.opencl/1483">
    <title>Re: Multiple-matrix products and two questions.</title>
    <link>http://permalink.gmane.org/gmane.comp.python.opencl/1483</link>
    <description>&lt;pre&gt;Hi Bogdan,
What does the .cl file look like?
As a beginner, I would certainly appreciate being able to see a complete
example.
Thanks,
Pedro


On Sat, May 11, 2013 at 5:57 PM, Bogdan Opanchuk &amp;lt;mantihor-Re5JQEeQqe8AvxtiuMwx3w&amp;lt; at &amp;gt;public.gmane.org&amp;gt; wrote:

&lt;/pre&gt;</description>
    <dc:creator>Pedro Marcal</dc:creator>
    <dc:date>2013-05-12T02:06:17</dc:date>
  </item>
  <item rdf:about="http://permalink.gmane.org/gmane.comp.python.opencl/1482">
    <title>Re: Multiple-matrix products and two questions.</title>
    <link>http://permalink.gmane.org/gmane.comp.python.opencl/1482</link>
    <description>&lt;pre&gt;Hi Till,

I'd do it like this:


out = np.empty((800, 500, 500))

# move the batch dimension to the front
# (this results in a permutation)
mat1 = a.transpose(1, 0, 2)

# swap the last two dimensions
# (this results in a permutation)
mat2 = mat1.transpose(0, 2, 1)  # == a.transpose(1, 2, 0)
#
# mat2 can be expressed in terms of "a" as well, but if transposes
# involve actual data movement and not just strides swap, this will be faster.

out = batched_dot(mat1, mat2)


Here batched_dot() goes over the 0-th dimension of both matrices and
dots dimensions 1 and 2. As far as I know, numpy does not have such
function, but it is a simple extension of the GPU dot kernel.
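
As a host-side illustration only (not the GPU kernel), batched_dot() could
be sketched in NumPy with einsum. The sizes below are scaled down from the
ones in the question, and the einsum-based helper is an assumption made for
this sketch; it is compared against the loop from the original question:

```python
import numpy as np

def batched_dot(x, y):
    # Assumed helper: x is (batch, n, k), y is (batch, k, m); result is (batch, n, m).
    return np.einsum("bik,bkj->bij", x, y)

# Scaled-down stand-in for the (500, 800, 10) array from the question.
a = np.random.rand(50, 80, 10)

mat1 = a.transpose(1, 0, 2)     # shape (80, 50, 10)
mat2 = mat1.transpose(0, 2, 1)  # shape (80, 10, 50)
out = batched_dot(mat1, mat2)   # shape (80, 50, 50)

# Reference result computed with the loop from the original question.
ref = np.empty((80, 50, 50))
for i in range(80):
    mat = a[:, i, :]
    ref[i, :, :] = np.dot(mat, mat.T)
```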


It is a common pattern and it does not involve any significant
performance loss, as long as your max_gid is relatively close to the
actual global size.
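
For example, the padded global size can be computed on the host like this.
This is only a sketch: round_up is a hypothetical helper, and 50000 and 128
are example sizes, not values from your program:

```python
def round_up(n, multiple):
    # Smallest multiple of `multiple` that is at least n.
    return ((n + multiple - 1) // multiple) * multiple

max_gid = 50000                              # work items that do real work
local_size = 128                             # chosen local workgroup size
global_size = round_up(max_gid, local_size)  # padded size passed at enqueue time
idle_items = global_size - max_gid           # work items that return immediately
```

With these example sizes only 48 of 50048 work items are idle, which is why
the cost of the check stays small.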

Best regards,
Bogdan

On Sun, May 12, 2013 at 7:01 AM, Till Stensitzki
&amp;lt;tillsten-ntLbU+u2Fft35pIdRWVIsoQuADTiUCJX&amp;lt; at &amp;gt;public.gmane.org&amp;gt; wrote:

&lt;/pre&gt;</description>
    <dc:creator>Bogdan Opanchuk</dc:creator>
    <dc:date>2013-05-12T00:57:38</dc:date>
  </item>
  <item rdf:about="http://permalink.gmane.org/gmane.comp.python.opencl/1481">
    <title>Multiple-matrix products and two questions.</title>
    <link>http://permalink.gmane.org/gmane.comp.python.opencl/1481</link>
    <description>&lt;pre&gt;Hi,
I already use some simple kernels to accelerate my data
fitting routines. One embarrassingly parallel part I am failing
to accelerate with OpenCL is the following:

a.shape is (500, 800, 10)

out = np.empty((800, 500, 500))
for i in range(800):
    mat = a[:, i, :]
    out[i, :, :] = np.dot(mat, mat.T)

Maybe someone can help. Note that
I could change the dim order of a if it would make it faster.

Some questions:
Some OpenCL implementations say they automatically set an optimal
local workgroup size, but my tests show they do not.

Also the global workgroup size has to be a multiple of the local
workgroup size. To use a faster local workgroup size ((128, 1) in my case)
I use an additional kernel parameter max_gid and test at the beginning
of the kernel:

if (get_global_id(0) &amp;gt;= max_gid) { return; }

Is there a better way?


greetings and thanks for the nice package!
Till Stensitzki





&lt;/pre&gt;</description>
    <dc:creator>Till Stensitzki</dc:creator>
    <dc:date>2013-05-11T21:01:31</dc:date>
  </item>
  <item rdf:about="http://permalink.gmane.org/gmane.comp.python.opencl/1480">
    <title>Re: batch enqueue</title>
    <link>http://permalink.gmane.org/gmane.comp.python.opencl/1480</link>
    <description>&lt;pre&gt;Thanks for the tips, they helped a lot. I've got the main loop down to this:

# set-args ahead of time

queues = [p._enqueue_args[0] for p in plans]
kerns = [p._enqueue_args[1] for p in plans]
gsize = [p._enqueue_args[2] for p in plans]
lsize = [p._enqueue_args[3] for p in plans]

[map(cl.enqueue_nd_range_kernel, queues, kerns, gsize, lsize)
     for i in xrange(n_calls)]

This appears to be low-enough overhead for the graph I'm working with, the
end-to-end wall time is about the same as the total time in the kernels, as
far as I can tell with all the noise in the timing measurements.


On Tue, May 7, 2013 at 6:56 PM, Andreas Kloeckner
&amp;lt;lists-vNJaATnTIRFBA2mtUNAoal6hYfS7NtTn&amp;lt; at &amp;gt;public.gmane.org&amp;gt;wrote:

&lt;/pre&gt;</description>
    <dc:creator>James Bergstra</dc:creator>
    <dc:date>2013-05-08T15:22:07</dc:date>
  </item>
  <item rdf:about="http://permalink.gmane.org/gmane.comp.python.opencl/1479">
    <title>Re: batch enqueue</title>
    <link>http://permalink.gmane.org/gmane.comp.python.opencl/1479</link>
    <description>&lt;pre&gt;Hi James,

James Bergstra &amp;lt;james.bergstra-Re5JQEeQqe8AvxtiuMwx3w&amp;lt; at &amp;gt;public.gmane.org&amp;gt; writes:

What you're saying is that Kernel.__call__ is too slow for your current
purposes, correct?

First off, it'd be great if you could take a look at Kernel.set_args:

https://github.com/inducer/pyopencl/blob/master/pyopencl/__init__.py#L559

and Kernel.__call__:

https://github.com/inducer/pyopencl/blob/master/pyopencl/__init__.py#L528

to see if there's any fat that could be trimmed from your
perspective. I've tried to keep this code path as quick as I could, but
there might be something I've overlooked.

Next, if there's nothing to be had in that direction, you can simply
call Kernel.set_args once and then repeatedly call
cl.enqueue_nd_range_kernel() as done in Kernel.__call__ (see source link
above). That should get reasonably close to the rate that the OpenCL API
itself can sustain.

Hope that helps,
Andreas

&lt;/pre&gt;</description>
    <dc:creator>Andreas Kloeckner</dc:creator>
    <dc:date>2013-05-07T22:56:20</dc:date>
  </item>
  <item rdf:about="http://permalink.gmane.org/gmane.comp.python.opencl/1478">
    <title>batch enqueue</title>
    <link>http://permalink.gmane.org/gmane.comp.python.opencl/1478</link>
    <description>&lt;pre&gt;Hi, I have written an opencl program that involves relatively small
kernels. For a certain benchmarking script, I have added up the time used
by kernels as 0.06 seconds, while the tightest python loop I can think of
still requires 0.2 seconds to execute the 5000-or-so kernel calls.  The
program involves repeatedly looping through the same kernels, with the same
arguments, so I was wondering if there was a way to enqueue several nd
range kernels at once, at least from Python's perspective. Is there such a
thing?

In other words, supposing I have kernels A and B, taking arguments x and y,
my program consists of:
A(x); B(y); A(x); B(y); ....

Ideally, I would like to enqueue 100 copies of the kernel sequence [(A, x),
(B, y)], but being able to enqueue even [(A, x), (B, y)] with one call
instead of 2 could be a big help.

- James
&lt;/pre&gt;</description>
    <dc:creator>James Bergstra</dc:creator>
    <dc:date>2013-05-07T22:19:15</dc:date>
  </item>
  <item rdf:about="http://permalink.gmane.org/gmane.comp.python.opencl/1477">
    <title>Re: RadixSort shape and dtype</title>
    <link>http://permalink.gmane.org/gmane.comp.python.opencl/1477</link>
    <description>&lt;pre&gt;Hello Andreas,

I think it is fine as it is, given that the .view() seems to be not too heavy on the performance. 


Dieter

Am 07.05.2013 um 04:19 schrieb Andreas Kloeckner &amp;lt;lists-vNJaATnTIRFBA2mtUNAoal6hYfS7NtTn&amp;lt; at &amp;gt;public.gmane.org&amp;gt;:


&lt;/pre&gt;</description>
    <dc:creator>Dieter Morgenroth</dc:creator>
    <dc:date>2013-05-07T18:57:44</dc:date>
  </item>
  <item rdf:about="http://permalink.gmane.org/gmane.comp.python.opencl/1476">
    <title>Re: RadixSort shape and dtype</title>
    <link>http://permalink.gmane.org/gmane.comp.python.opencl/1476</link>
    <description>&lt;pre&gt;Hi Dieter,

Dieter Morgenroth &amp;lt;dieter.morgenroth-S0/GAf8tV78&amp;lt; at &amp;gt;public.gmane.org&amp;gt; writes:

Sorry--I didn't look closely enough the first time around. You told the
sort that you passed it a float4, so you can't blame it for returning
one to you. :) What you're seeing is actually correct behavior. Just use
.view() as you suggest to restore the inner dimension.
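
A NumPy-only sketch of that round trip, assuming a structured dtype with
four float32 fields as a stand-in for float4 (the field names and the array
size are assumptions of this sketch, not taken from your code):

```python
import numpy as np

# Stand-in for a float4 dtype: four float32 fields (field names assumed).
float4 = np.dtype([("x", np.float32), ("y", np.float32),
                   ("z", np.float32), ("w", np.float32)])

pos = np.random.random_sample((101, 4)).astype(np.float32)

# Reinterpret each row of four floats as one float4 element: shape (101,).
as_float4 = pos.view(float4).reshape(-1)

# Restore the inner dimension: view as float32 again, then reshape to (101, 4).
restored = as_float4.view(np.float32).reshape(-1, 4)
```

Neither view copies the data; both only reinterpret the same memory.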

Based on your use of the sort, I was wondering whether it might be
worthwhile to allow custom reordering code snippets... so that you
wouldn't have to lie about the type of your array. :) I'm a bit on the
fence on this. If you've got an opinion, let me know.

Andreas

&lt;/pre&gt;</description>
    <dc:creator>Andreas Kloeckner</dc:creator>
    <dc:date>2013-05-07T02:19:21</dc:date>
  </item>
  <item rdf:about="http://permalink.gmane.org/gmane.comp.python.opencl/1475">
    <title>Re: __getitem__ for pyopencl array</title>
    <link>http://permalink.gmane.org/gmane.comp.python.opencl/1475</link>
    <description>&lt;pre&gt;Hi Alex,

Alex Nitz &amp;lt;alex.nitz-beguxJGv3xY&amp;lt; at &amp;gt;public.gmane.org&amp;gt; writes:

First of all, thanks for your contribution! I'm a bit hesitant to apply
this patch, because sub-buffers (which your patch implicitly uses, see
clCreateSubBuffer in the CL spec) are allowed to have alignment
requirements that make this routine fail. The better way to implement
this is to use the original buffer and store an offset to the intended
beginning of the data. I'll introduce this after the 2013.1 release,
which is due soon. (as soon as I sort out the current Mac trouble)

Andreas

&lt;/pre&gt;</description>
    <dc:creator>Andreas Kloeckner</dc:creator>
    <dc:date>2013-05-07T01:39:48</dc:date>
  </item>
  <item rdf:about="http://permalink.gmane.org/gmane.comp.python.opencl/1474">
    <title>__getitem__ for pyopencl array</title>
    <link>http://permalink.gmane.org/gmane.comp.python.opencl/1474</link>
    <description>&lt;pre&gt;Hello,

I am mostly a pycuda user, but am investigating trying to use some of my
codes with pyopencl. My codes make heavy use of the numpy-like array. I
noticed that there doesn't seem to be a "__getitem__" function defined
yet, although the buffer objects themselves have one.

My needs are basically met by the version that is in pycuda, so I have
created a short patch to add the same behavior to pyopencl. It is fairly
limited in that it only supports 1-dimensional, non-strided slices. Is a
more comprehensive functionality already in the works? If not, would it be
possible to get this patch applied?

Thanks,

Alex
&lt;/pre&gt;</description>
    <dc:creator>Alex Nitz</dc:creator>
    <dc:date>2013-05-05T22:44:20</dc:date>
  </item>
  <item rdf:about="http://permalink.gmane.org/gmane.comp.python.opencl/1473">
    <title>Re: RadixSort shape and dtype</title>
    <link>http://permalink.gmane.org/gmane.comp.python.opencl/1473</link>
    <description>&lt;pre&gt;Hello Andreas,

I extracted the problem to a few lines here. Hope this helps. I also 
found the "view" function, that seems to fix the problem.
Or am I doing something wrong? I am really not sure about the arguments 
string. Maybe the float4 is my mistake?

-Dieter

import pyopencl as cl
import pyopencl.array as cl_array
import pyopencl.algorithm as cl_algorithm
import pyopencl.tools as cl_tools
import numpy
ctx = cl.create_some_context()
queue = cl.CommandQueue(ctx)

#prepare the sort algorithm
arguments = "unsigned int *dindices, float4 *dpos"
key_expr = "dindices[i]"
sort_arg_names = ["dindices", "dpos"]
sortCl = cl_algorithm.RadixSort(ctx, arguments, key_expr,
    sort_arg_names, bits_at_a_time=2, index_dtype=numpy.uint32,
    key_dtype=numpy.uint32, options=[])

#input arrays on host and device
allocNum = 101
hpos = numpy.ascontiguousarray(
    numpy.random.random_sample((allocNum, 4)), dtype=numpy.float32)
hindices = numpy.random.randint(0, allocNum, allocNum)
dindices = cl_array.to_device(queue, hindices)
dp&lt;/pre&gt;</description>
    <dc:creator>Dieter Morgenroth</dc:creator>
    <dc:date>2013-05-05T20:43:13</dc:date>
  </item>
  <item rdf:about="http://permalink.gmane.org/gmane.comp.python.opencl/1472">
    <title>Re: RadixSort shape and dtype</title>
    <link>http://permalink.gmane.org/gmane.comp.python.opencl/1472</link>
    <description>&lt;pre&gt;

Uh-oh. That looks like something in PyOpenCL decided that you would like
an array of float4s. That is weird. Can you please send a small snippet
of code that reproduces this to help me fix this?

Thanks,
Andreas

&lt;/pre&gt;</description>
    <dc:creator>Andreas Kloeckner</dc:creator>
    <dc:date>2013-05-05T00:31:59</dc:date>
  </item>
  <textinput rdf:about="http://search.gmane.org/?group=$group=gmane.comp.python.opencl">
    <title>Search Engine</title>
    <description>Search the mailing list at Gmane</description>
    <name>query</name>
    <link>http://search.gmane.org/?group=$group=gmane.comp.python.opencl</link>
  </textinput>
</rdf:RDF>
