<?xml version="1.0" encoding="UTF-8"?>
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns="http://purl.org/rss/1.0/" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:syn="http://purl.org/rss/1.0/modules/syndication/" xmlns:admin="http://webns.net/mvcb/">
  <channel rdf:about="http://blog.gmane.org/gmane.comp.clustering.open-mpi.user">
    <title>gmane.comp.clustering.open-mpi.user</title>
    <link>http://blog.gmane.org/gmane.comp.clustering.open-mpi.user</link>
    <description/>
    <syn:updatePeriod>hourly</syn:updatePeriod>
    <syn:updateFrequency>1</syn:updateFrequency>
    <syn:updateBase>1901-01-01T00:00+00:00</syn:updateBase>
    <items>
      <rdf:Seq>
        <rdf:li rdf:resource="http://comments.gmane.org/gmane.comp.clustering.open-mpi.user/19141"/>
        <rdf:li rdf:resource="http://comments.gmane.org/gmane.comp.clustering.open-mpi.user/19130"/>
        <rdf:li rdf:resource="http://comments.gmane.org/gmane.comp.clustering.open-mpi.user/19129"/>
        <rdf:li rdf:resource="http://comments.gmane.org/gmane.comp.clustering.open-mpi.user/19126"/>
        <rdf:li rdf:resource="http://comments.gmane.org/gmane.comp.clustering.open-mpi.user/19120"/>
        <rdf:li rdf:resource="http://comments.gmane.org/gmane.comp.clustering.open-mpi.user/19112"/>
        <rdf:li rdf:resource="http://comments.gmane.org/gmane.comp.clustering.open-mpi.user/19110"/>
        <rdf:li rdf:resource="http://comments.gmane.org/gmane.comp.clustering.open-mpi.user/19106"/>
        <rdf:li rdf:resource="http://comments.gmane.org/gmane.comp.clustering.open-mpi.user/19098"/>
        <rdf:li rdf:resource="http://comments.gmane.org/gmane.comp.clustering.open-mpi.user/19093"/>
        <rdf:li rdf:resource="http://comments.gmane.org/gmane.comp.clustering.open-mpi.user/19083"/>
        <rdf:li rdf:resource="http://comments.gmane.org/gmane.comp.clustering.open-mpi.user/19076"/>
        <rdf:li rdf:resource="http://comments.gmane.org/gmane.comp.clustering.open-mpi.user/19071"/>
        <rdf:li rdf:resource="http://comments.gmane.org/gmane.comp.clustering.open-mpi.user/19064"/>
        <rdf:li rdf:resource="http://comments.gmane.org/gmane.comp.clustering.open-mpi.user/19063"/>
        <rdf:li rdf:resource="http://comments.gmane.org/gmane.comp.clustering.open-mpi.user/19060"/>
        <rdf:li rdf:resource="http://comments.gmane.org/gmane.comp.clustering.open-mpi.user/19054"/>
        <rdf:li rdf:resource="http://comments.gmane.org/gmane.comp.clustering.open-mpi.user/19046"/>
        <rdf:li rdf:resource="http://comments.gmane.org/gmane.comp.clustering.open-mpi.user/19038"/>
        <rdf:li rdf:resource="http://comments.gmane.org/gmane.comp.clustering.open-mpi.user/19033"/>
        <rdf:li rdf:resource="http://comments.gmane.org/gmane.comp.clustering.open-mpi.user/19029"/>
      </rdf:Seq>
    </items>
    <image rdf:resource="http://gmane.org/img/gmane-25t.png"/>
    <textinput rdf:resource="http://search.gmane.org/?group=$group=gmane.comp.clustering.open-mpi.user"/>
  </channel>
  <image rdf:about="http://gmane.org/img/gmane-25t.png">
    <title>Gmane</title>
    <url>http://gmane.org/img/gmane-25t.png</url>
    <link>http://gmane.org</link>
  </image>
  <item rdf:about="http://comments.gmane.org/gmane.comp.clustering.open-mpi.user/19141">
    <title>[OMPI users] opening a file with MPI-IO</title>
    <link>http://comments.gmane.org/gmane.comp.clustering.open-mpi.user/19141</link>
    <description>&lt;pre&gt;Dear users,

I have been banging my head against the wall for some time to find a 
reliable and portable way to determine if a call to MPI::File::Open() 
was successful or not.

Let me give some background information first. We develop an open-source 
astrophysical modeling code called Cloudy. This is used by many 
scientists on a variety of platforms. We obviously have no control over 
the MPI version that is installed on those platforms; it may not even be 
Open MPI. So what we need is a method that is supported by all MPI distros.

Our code is written in C++, so we use the C++ version of the MPI and 
MPI-IO libraries.

Any help would be greatly appreciated.


Cheers,

Peter.

&lt;/pre&gt;</description>
    <dc:creator>Peter van Hoof</dc:creator>
    <dc:date>2013-05-17T09:00:36</dc:date>
  </item>
  <item rdf:about="http://comments.gmane.org/gmane.comp.clustering.open-mpi.user/19130">
    <title>[OMPI users] plm:tm: failed to spawn daemon, error code = 17000 Error when running jobs on 600 or more nodes</title>
    <link>http://comments.gmane.org/gmane.comp.clustering.open-mpi.user/19130</link>
    <description>&lt;pre&gt;Dear Support,

We are having an issue with our OMPI runs. When we run jobs on &amp;lt;=550 
machines (550 x 16 cores) they work without any problem. As soon as 
we run them on 600 or more machines we get the "plm:tm: failed to spawn 
daemon, error code = 17000" error.

We are using:

OpenMPI ver: 1.6.4 (Compiled with GCC v4.4.6)
Torque ver: 2.5.12

The ompi_info's output is attached.


The environment variables are pasted below.


Please assist.


env       envsubst
[ocfacc&amp;lt; at &amp;gt;cyan01 fullrun]$ env
MODULE_VERSION_STACK=3.2.10
OMPI_MCA_mtl=^psm
MANPATH=/local/software/openmpi/1.6.4/gcc/share/man:/local/software/moab/6.1.10/man:/usr/local/share/man:/usr/share/man/overrides:/usr/share/man:/local/Modules/default/share/man
HOSTNAME=cyan01
SHELL=/bin/bash
TERM=xterm
HISTSIZE=1000
QTDIR=/usr/lib64/qt-3.3
OLDPWD=/home/ocfacc/hpl/fullrun/results
QTINC=/usr/lib64/qt-3.3/include
LC_ALL=POSIX
USER=ocfacc
LD_LIBRARY_PATH=/local/software/openmpi/1.6.4/gcc/lib:/local/software/torque/default/lib
LS_COLORS=rs=0:di=01;34:ln=01;36:m&lt;/pre&gt;</description>
    <dc:creator>Qamar Nazir</dc:creator>
    <dc:date>2013-05-16T16:09:19</dc:date>
  </item>
  <item rdf:about="http://comments.gmane.org/gmane.comp.clustering.open-mpi.user/19129">
    <title>[OMPI users] Configuration with Intel C++ Composer 12.0.2 on OSX10.7.5</title>
    <link>http://comments.gmane.org/gmane.comp.clustering.open-mpi.user/19129</link>
    <description>&lt;pre&gt;
I am having trouble configuring OpenMPI-1.6.4 with the Intel C/C++ composer (12.0.2). My OS is OSX 10.7.5.

I am not a computer whizz so I hope I can explain what I did properly:

1) In bash, I ran source /opt/intel/bin/compilervars.sh intel64 
and then echo $PATH showed: 
/opt/intel/composerxe-2011.2.142/bin/intel64:/opt/intel/composerxe-2011.2.142/mpirt/bin/intel64:/opt/intel/composerxe-2011.2.142/bin:/Library/Frameworks/EPD64.framework/Versions/Current/bin:/Library/Frameworks/Python.framework/Versions/Current/bin:.:/Library/Frameworks/EPD64.framework/Versions/Current/bin:/usr/bin:/bin:/usr/sbin:/sbin:/usr/local/bin:/usr/X11/bin

2) which icc and which icpc showed:
/opt/intel/composerxe-2011.2.142/bin/intel64/icc
and
/opt/intel/composerxe-2011.2.142/bin/intel64/icpc

So that all seems okay to me. Still when I do
./configure CC=icc CXX=icpc F77=ifort FC=ifort --prefix=/opt/openmpi-1.6.4
from the folder in which the extracted OpenMPI files sit, I get

=========================================================&lt;/pre&gt;</description>
    <dc:creator>Geraldine Hochman-Klarenberg</dc:creator>
    <dc:date>2013-05-16T15:57:42</dc:date>
  </item>
  <item rdf:about="http://comments.gmane.org/gmane.comp.clustering.open-mpi.user/19126">
    <title>[OMPI users] distributed file system</title>
    <link>http://comments.gmane.org/gmane.comp.clustering.open-mpi.user/19126</link>
    <description>&lt;pre&gt;Hi

Do we need distributed file system (like NFS) when running MPI program on
multiple machines?

thanks,
Reza
_______________________________________________
users mailing list
users&amp;lt; at &amp;gt;open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/users&lt;/pre&gt;</description>
    <dc:creator>Reza Bakhshayeshi</dc:creator>
    <dc:date>2013-05-16T15:24:41</dc:date>
  </item>
  <item rdf:about="http://comments.gmane.org/gmane.comp.clustering.open-mpi.user/19120">
    <title>[OMPI users] Subject: Building openmpi-1.6.4 with 64-bit integers</title>
    <link>http://comments.gmane.org/gmane.comp.clustering.open-mpi.user/19120</link>
    <description>&lt;pre&gt; 
 Dear mpi team / users:
 
 To get an MPI build with 64-bit integers (Linux system:
 Ubuntu 12.04) I invoked the following
 configuration options:
 
 ./configure --prefix=/opt/openmpi CXX=icpc CC=icc F77=ifort FC=ifort
 FFLAGS=-i8 FCFLAGS=-i8
 
 The subsequent make/install scripts apparently
 went through smoothly, but when I check
 
 ompi_info -a | grep 'Fort integer size'
 
 the result reads:
 
 Fort integer size: 4
 
 What went awry?
 For all hints and suggestions many thanks in advance,
 Hans H.
&lt;/pre&gt;</description>
    <dc:creator>H Hogreve</dc:creator>
    <dc:date>2013-05-15T23:42:48</dc:date>
  </item>
  <item rdf:about="http://comments.gmane.org/gmane.comp.clustering.open-mpi.user/19112">
    <title>[OMPI users] Unexpected behavior: MPI_Comm_accept, MPI_Comm_connect, and MPI_THREAD_MULTIPLE</title>
    <link>http://comments.gmane.org/gmane.comp.clustering.open-mpi.user/19112</link>
    <description>&lt;pre&gt;I've been playing with some code to try and become familiar with
MPI_Comm_accept and MPI_Comm_connect to implement an MPI
client/server.  The code that I have simply sends a single MPI_INT,
the client process pid, to the server and then disconnects.  The code
that I have works for a few test runs but then on the 2nd or 3rd
client connection, the server seems to stop responding and the client
spins 100% CPU in the call to MPI_Comm_accept.  Am I doing something
wrong in my code?  Thanks in advance for any help.  First, an example
run ...

In terminal #1, start the name service

$ ompi-server -r ${PREFIX}/var/run/ompi-server/uri.txt
$

In terminal #2, start the server code

$ mpirun -mca btl tcp,sm,self \
--ompi-server file:${PREFIX}/var/run/ompi-server/uri.txt mpi-server
mpi-server pid 41556
Opened port 2011758592.0;tcp://10.161.1.73:51113+2011758593.0;\
tcp://10.161.1.73:51114:300
MPI_Info_set("ompi_global_scope", "true")
Published {"mpi-server-example", "2011758592.0;\
tcp://10.161.1.73:51113+2011758593.0;tcp&lt;/pre&gt;</description>
    <dc:creator>Damien Kick</dc:creator>
    <dc:date>2013-05-14T18:15:31</dc:date>
  </item>
  <item rdf:about="http://comments.gmane.org/gmane.comp.clustering.open-mpi.user/19110">
    <title>[OMPI users] MPI_SUM is not defined on the MPI_INTEGER datatype</title>
    <link>http://comments.gmane.org/gmane.comp.clustering.open-mpi.user/19110</link>
    <description>&lt;pre&gt;Hello, I'm kuni255.

I built a Beowulf-type PC cluster (CentOS release 6.4), and I am studying
MPI (Open MPI version 1.6.4). I tried the following sample, which uses
MPI_REDUCE.

Then an error occurred.

This cluster consists of one head node and 2 slave nodes.
The home directory on the head node is shared via NFS, so Open MPI is
installed on each node.

When I test this program on only the head node, the program runs correctly
and outputs the result.
But when I test this program on only a slave node, the same error occurs.

Please tell me if you have a good idea : )

Error message
[bwslv01:30793] *** An error occurred in MPI_Reduce: the reduction
operation MPI_SUM is not defined on the MPI_INTEGER datatype
[bwslv01:30793] *** on communicator MPI_COMM_WORLD
[bwslv01:30793] *** MPI_ERR_OP: invalid reduce operation
[bwslv01:30793] *** MPI_ERRORS_ARE_FATAL: your MPI job will now abort
--------------------------------------------------------------------------
mpirun has exited due to process rank 1 with PID 30793 on
node bwslv01 exiting improperly. There a&lt;/pre&gt;</description>
    <dc:creator>Hayato KUNIIE</dc:creator>
    <dc:date>2013-05-14T15:39:06</dc:date>
  </item>
  <item rdf:about="http://comments.gmane.org/gmane.comp.clustering.open-mpi.user/19106">
    <title>[OMPI users] How to Read the Rank from the MPI_TASK</title>
    <link>http://comments.gmane.org/gmane.comp.clustering.open-mpi.user/19106</link>
    <description>&lt;pre&gt;Hello All,

I am a new user on this mailing list and am trying to get familiar with MPI.
I am using Open MPI version 1.4.3.

I have written a simple shell script, helloWorld.sh:

&amp;lt;code&amp;gt;
#!/bin/bash
echo " Hello World from Rank $rank"
&amp;lt;/code&amp;gt;

The MPI command that I am executing is as follows:

mpirun -np 3 helloWorld.sh

What I want to know is which parameter I need to pass on the mpirun command line so that I can read the rank of each task within my helloWorld shell script.

Please excuse me if this is a duplicate entry. I have searched through all the forums and did not find the answer to my question.

Thanks in advance for all your help.

Regards,
Deepak&lt;/pre&gt;</description>
    <dc:creator>deepak mehta</dc:creator>
    <dc:date>2013-05-10T20:39:34</dc:date>
  </item>
  <item rdf:about="http://comments.gmane.org/gmane.comp.clustering.open-mpi.user/19098">
    <title>[OMPI users] /usr/bin/ld: skipping incompatible ......</title>
    <link>http://comments.gmane.org/gmane.comp.clustering.open-mpi.user/19098</link>
    <description>&lt;pre&gt;Dear All,
I have recently installed gcc 4.7.3 on my CentOS 6.4 system. Moreover, I have
compiled openmpi 1.6.4 with the above compiler.

My LD_LIBRARY_PATH is set correctly and it points to both /lib and /lib64
where libgfortran.so and libgcc_s.so for 32 and 64 bits are located.

Every time I compile Fortran, C or C++ source code with the wrapper
mpif90, mpicc or mpicxx I get this warning:

/usr/bin/ld: skipping incompatible /scratch/home0/pmatteo
/research/lib_install/lib/libgfortran.so when searching for -lgfortran

or

/usr/bin/ld: skipping incompatible /scratch/home0/pmatteo
/research/lib_install/lib/libgcc_s.so when searching for -lgcc_s

I have switched my LD_LIBRARY_PATH as suggested in this thread:
http://www.open-mpi.org/community/lists/users/2009/02/8067.php

but nothing changed.

Any idea what I am doing wrong?

I know that it is just a warning but I would like to avoid it.

Thank you.


&lt;/pre&gt;</description>
    <dc:creator>Matteo Parsani</dc:creator>
    <dc:date>2013-05-08T19:12:55</dc:date>
  </item>
  <item rdf:about="http://comments.gmane.org/gmane.comp.clustering.open-mpi.user/19093">
    <title>[OMPI users] MPI for real time analysis</title>
    <link>http://comments.gmane.org/gmane.comp.clustering.open-mpi.user/19093</link>
    <description>&lt;pre&gt;Are people in the community using MPI libraries in their applications for 
real time processing / analytics? I have access to a cluster with 
infiniband and want to test millions of hypotheses and depending upon 
which one passed would run an aggregate function over them. I would like 
to achieve a subsecond response for this and was wondering how one could 
take care of the network calls overhead? Thanks!

&lt;/pre&gt;</description>
    <dc:creator>RoboBeans</dc:creator>
    <dc:date>2013-05-07T17:48:44</dc:date>
  </item>
  <item rdf:about="http://comments.gmane.org/gmane.comp.clustering.open-mpi.user/19083">
    <title>[OMPI users] Problems with building with VS 2010 and VS 2012</title>
    <link>http://comments.gmane.org/gmane.comp.clustering.open-mpi.user/19083</link>
    <description>&lt;pre&gt;Hello everyone,

I'm having trouble building OpenMPI v1.6.4 (actually all v1.6 and
v1.7 versions) release sources (in both x86 and x64 modes) on a Windows
machine using VS 2010 and VS 2012 compilers and cmake v2.8.10.2. Just to
note, I tried it with VS2008 too and it worked fine in x86 mode, but failed
in x64 mode. Here is the error I get with VS2010 and VS2012 when I execute
'cmake-gui ..' and push Configure:

CMake Error at C:/Program Files (x86)/CMake
2.8/share/cmake-2.8/Modules/CMakeFortranInformation.cmake:27
(get_filename_component):
get_filename_component called with incorrect number of arguments
Call Stack (most recent call first):
contrib/platform/win32/CMakeModules/setup_f77.cmake:18 (include)
contrib/platform/win32/CMakeModules/ompi_configure.cmake:616 (INCLUDE)
CMakeLists.txt:99 (BEGIN_CONFIGURE)

If I naively mask off the use of setup_F77 in ompi_configure.cmake (because I
don't want Fortran support), I get the following compile error:

1&amp;gt;------ Build started: Project: libmpi_cxx, Configuration: Debug&lt;/pre&gt;</description>
    <dc:creator>Nenad Vujicic</dc:creator>
    <dc:date>2013-05-07T15:16:37</dc:date>
  </item>
  <item rdf:about="http://comments.gmane.org/gmane.comp.clustering.open-mpi.user/19076">
    <title>[OMPI users] Building Open MPI with LSF</title>
    <link>http://comments.gmane.org/gmane.comp.clustering.open-mpi.user/19076</link>
    <description>&lt;pre&gt;Hi everyone,

I want to install OpenMPI on the LSF cluster in our organization. I am not
proficient with Linux/LSF, and some of my questions might come from a lack of
understanding of the system, and not be related to OpenMPI directly.

So far I found these bits of information on the site of OpenMPI

*1.* *OpenMPI support for 1.6 seems to be broken, and was fixed maybe in
1.7?*
http://www.open-mpi.org/community/lists/users/2013/03/21640.php

*2. The installation on LSF is supposed to be easy:*
http://www.open-mpi.org/faq/?category=building#build-rte-lsf


My questions are:
*1*. What is the latest stable version that is known to integrate in a
native way with LSF?
*2*. When building with LSF support, in what directory should I run the
./configure and make scripts from? Should I be logged in to one of the
hosts of the LSF cluster?
*3*. Will these scripts copy openmpi shared libraries into each host on the
cluster?
*4*. Where will the mpi compiler be after the installation? What include
paths and libraries should I add &lt;/pre&gt;</description>
    <dc:creator>Andrey Rubshtein</dc:creator>
    <dc:date>2013-05-07T14:09:42</dc:date>
  </item>
  <item rdf:about="http://comments.gmane.org/gmane.comp.clustering.open-mpi.user/19071">
    <title>[OMPI users] running openmpi with specified lib path</title>
    <link>http://comments.gmane.org/gmane.comp.clustering.open-mpi.user/19071</link>
    <description>&lt;pre&gt;Hi folks,

I am testing our cluster with a module environment, and am having a
hard time understanding openmpi 1.7.2!!! So our system currently has
openmpi 1.6.3 (at default location /usr/local), 1.6.4 and 1.7.2 compiled
with intel compilers (installed at /opt/apps). In order to use openmpi
1.7.2 for example, I tried:

$ module load mpi/openmpi-1.7.2_composer_xe_2013.3.163
$ module load
apps/abinit-7.2.1_composer_xe_2013.3.163_openmpi-1.7.2_intel_fftw3-mkl
$ mpirun ./mpihello_intel
mca: base: component_find: unable to open
/usr/local/lib/openmpi/mca_ess_hnp:
/usr/local/lib/openmpi/mca_ess_hnp.so: undefined symbol:
orte_local_jobdata (ignored)
mca: base: component_find: unable to open
/usr/local/lib/openmpi/mca_ess_slurm:
/usr/local/lib/openmpi/mca_ess_slurm.so: undefined symbol:
orte_orted_exit_with_barrier (ignored)
mca: base: component_find: unable to open
/usr/local/lib/openmpi/mca_ess_slurmd:
/usr/local/lib/openmpi/mca_ess_slurmd.so: undefined symbol:
orte_pmap_t_class (ignored)
mpirun: symbol lookup error:&lt;/pre&gt;</description>
    <dc:creator>Duke Nguyen</dc:creator>
    <dc:date>2013-05-07T11:36:39</dc:date>
  </item>
  <item rdf:about="http://comments.gmane.org/gmane.comp.clustering.open-mpi.user/19064">
    <title>[OMPI users] Help diagnosing problem: not being able to run MPIcode across computers</title>
    <link>http://comments.gmane.org/gmane.comp.clustering.open-mpi.user/19064</link>
    <description>&lt;pre&gt;Hi,

I have used OpenMPI before without any trouble, and configured MPICH,
MPICH2 and OpenMPI on many different machines before, but recently we
upgraded the OS to Fedora 17, and now I'm having trouble running an MPI
code on two of our machines connected via a switch.

I thought perhaps the old installation was causing problems, so I
reinstalled OpenMPI (1.6.4) and I have no trouble when running a
parallel code on just one node. I also don't have any trouble ssh'ing
(without need for a password) between these machines, but when I try to
run a parallel job spanning both machines, I get a hung mpiexec
process on the submitting machine, and an "orted" process on the other
machine, but nothing moves. 

I guess it is an issue with libraries and/or different MPI versions (the
machines have other site-wide MPI libraries installed), but I'm not sure
how to debug the issue. I looked in the FAQ, but I didn't find anything
relevant. Issue
http://www.open-mpi.org/faq/?category=running#intel-compilers-static is
different&lt;/pre&gt;</description>
    <dc:creator>Angel de Vicente</dc:creator>
    <dc:date>2013-05-04T23:54:00</dc:date>
  </item>
  <item rdf:about="http://comments.gmane.org/gmane.comp.clustering.open-mpi.user/19063">
    <title>[OMPI users] libtool *.la files with references to install dir</title>
    <link>http://comments.gmane.org/gmane.comp.clustering.open-mpi.user/19063</link>
    <description>&lt;pre&gt;Hello,
when installing openmpi-1.6.4 (`cd /root/openmpi-1.6.4; ./configure --prefix=/usr/opt/openmpi-1.6.4/gcc-4.7.2/mxib --with-openib --with-mx=/opt/mx/1.2.16/3.2.0-4-amd64 --with-slurm; make; make install`)
we found some references to non-existing paths in *.la files as in the
following example:

`cat /usr/opt/openmpi-1.6.4/gcc-4.7.2/mxib/lib/libmpi.la` (this is the
installed version of openmpi after the `make install` from above):

...
dependency_libs=' -L/root/openmpi-1.6.4/opal/mca/hwloc/hwloc132/hwloc/src  -ldl -lrt -lnsl -lutil -lm'
...

Now applications using GNU libtool will stop here during their
compilation with an error because this path (which is a subdirectory of
the path where I compiled openmpi on a different system!) does not exist.
(And even if it would exist this would lead to a "permission denied"
error because /root is not readable ...and finally there are no libs in
the src directory /root/openmpi-1.6.4/opal/mca/hwloc/hwloc132/hwloc/src).

# are the *.la files necessary at all?
(probabl&lt;/pre&gt;</description>
    <dc:creator>Stefan Friedel</dc:creator>
    <dc:date>2013-05-03T14:19:03</dc:date>
  </item>
  <item rdf:about="http://comments.gmane.org/gmane.comp.clustering.open-mpi.user/19060">
    <title>[OMPI users] Message queue in MPI?</title>
    <link>http://comments.gmane.org/gmane.comp.clustering.open-mpi.user/19060</link>
    <description>&lt;pre&gt;If I'm using MPI_Send(...) and MPI_Recv(...) in a producer/consumer
model and choose not to buffer messages internally (in the app),
allowing them to accumulate in the MPI layer, how large of an MPI
message queue can I expect before something breaks?

---John
&lt;/pre&gt;</description>
    <dc:creator>John Chludzinski</dc:creator>
    <dc:date>2013-05-02T19:23:10</dc:date>
  </item>
  <item rdf:about="http://comments.gmane.org/gmane.comp.clustering.open-mpi.user/19054">
    <title>[OMPI users] How to reduce Isend &amp; Irecv bandwidth?</title>
    <link>http://comments.gmane.org/gmane.comp.clustering.open-mpi.user/19054</link>
    <description>&lt;pre&gt;Hi,

I have a program where each MPI rank hosts a set of data blocks. After
doing computation over *some of* its local data blocks, each MPI rank needs
to exchange data with other ranks. Note that the computation may involve
only a subset of the data blocks on an MPI rank. The data exchange is
achieved at each MPI rank through Isend and Irecv and then Waitall to
complete the requests. Each pair of Isend and Irecv exchanges a
corresponding pair of data blocks at different ranks. Right now, we do
Isend/Irecv for EVERY block!

The idea is that because the computation at a rank may only involve a
subset of blocks, we could mark those blocks as dirty during the
computation. And to reduce data exchange bandwidth, we could exchange only
those *dirty* pairs across ranks.

The problem is: if a rank does not compute on a block 'm', and if it does
not call Isend for 'm', then the receiving rank must somehow know this and
either a) not call Irecv for 'm' as well, or b) let Irecv for 'm' fail
gracefully.

My questi&lt;/pre&gt;</description>
    <dc:creator>Thomas Watson</dc:creator>
    <dc:date>2013-05-01T17:28:26</dc:date>
  </item>
  <item rdf:about="http://comments.gmane.org/gmane.comp.clustering.open-mpi.user/19046">
    <title>[OMPI users] job termination on grid</title>
    <link>http://comments.gmane.org/gmane.comp.clustering.open-mpi.user/19046</link>
    <description>&lt;pre&gt;Hello,

My recent job started normally but after a few hours of running died with
the following message:

--------------------------------------------------------------------------
A daemon (pid 19390) died unexpectedly with status 137 while attempting
to launch so we are aborting.

There may be more information reported by the environment (see above).

This may be because the daemon was unable to find all the needed shared
libraries on the remote node. You may set your LD_LIBRARY_PATH to have the
location of the shared libraries on the remote nodes and this will
automatically be forwarded to the remote nodes.
--------------------------------------------------------------------------
--------------------------------------------------------------------------
mpirun noticed that the job aborted, but has no info as to the process
that caused that situation.

The scheduling script is below:

#$ -S /bin/bash
#$ -cwd
#$ -N SC3blastx_64-96thr
#$ -pe openmpi* 64-96
#$ -l h_rt=24:00:00,vf=3G
&lt;/pre&gt;</description>
    <dc:creator>Vladimir Yamshchikov</dc:creator>
    <dc:date>2013-04-30T19:26:53</dc:date>
  </item>
  <item rdf:about="http://comments.gmane.org/gmane.comp.clustering.open-mpi.user/19038">
    <title>[OMPI users] Broadcast problem</title>
    <link>http://comments.gmane.org/gmane.comp.clustering.open-mpi.user/19038</link>
    <description>&lt;pre&gt;I have a number of processes split into senders and receivers.
Senders read large quantities of randomly organised data into buffers for transmission to receivers.
When a buffer is full it needs to be transmitted to all receivers; this repeats until all the data is transmitted.

The problem is that MPI_Bcast must know the root it is to receive from and therefore can't receive 'blind' from the first full sender.
Scatter would be inefficient because a few senders won't have anything to send, so it's wasteful to transmit those empty buffers repeatedly. 

Any ideas?
Can Bcast receivers be promiscuous?

Thanks Randolph&lt;/pre&gt;</description>
    <dc:creator>Randolph Pullen</dc:creator>
    <dc:date>2013-04-30T06:43:23</dc:date>
  </item>
  <item rdf:about="http://comments.gmane.org/gmane.comp.clustering.open-mpi.user/19033">
    <title>[OMPI users] LD_LIBRARY_PATH Problem</title>
    <link>http://comments.gmane.org/gmane.comp.clustering.open-mpi.user/19033</link>
    <description>&lt;pre&gt;Dear users,
I am getting the following error while doing a calculation. The job is getting
terminated before writing anything to the output file.
==========================================================================================
ssh: ibc18: Name or service not known
--------------------------------------------------------------------------
A daemon (pid 8103) died unexpectedly with status 255 while attempting
to launch so we are aborting.

There may be more information reported by the environment (see above).

This may be because the daemon was unable to find all the needed shared
libraries on the remote node. You may set your LD_LIBRARY_PATH to have the
location of the shared libraries on the remote nodes and this will
automatically be forwarded to the remote nodes.
--------------------------------------------------------------------------
--------------------------------------------------------------------------
mpirun noticed that the job aborted, but has no info as to the process
that caused that situ&lt;/pre&gt;</description>
    <dc:creator>sudhirs&lt; at &gt;</dc:creator>
    <dc:date>2013-04-29T13:12:50</dc:date>
  </item>
  <item rdf:about="http://comments.gmane.org/gmane.comp.clustering.open-mpi.user/19029">
    <title>[OMPI users] Broadcast and root process</title>
    <link>http://comments.gmane.org/gmane.comp.clustering.open-mpi.user/19029</link>
    <description>&lt;pre&gt;Hi,

I'm new on this list. I have been using MPI for years but I have not written a
lot of code with MPI, so my question is perhaps ridiculous:

I'm using a Computational Fluid Dynamics (CFD) solver. This solver uses
MPI to exchange the data between the different partitions. In this
solver the "root processor" is always processor 1. So this proc
reads the input, broadcasts a lot of things and writes the output.

During a time step the solver computes the reference pressure at a
point. This computation is done on a processor, which may not be the
root processor. Therefore after the computation a broadcast of the value
is necessary. For the moment in the code the broadcast is done with the
processor, where the reference pressure is computed, as root processor
(and not with the standard "root processor").

Is this wrong? Must the root processor be the same during a computation
for all broadcasts?

Best regards,
Guillaume
&lt;/pre&gt;</description>
    <dc:creator>giggzounet</dc:creator>
    <dc:date>2013-04-29T11:15:04</dc:date>
  </item>
  <textinput rdf:about="http://search.gmane.org/?group=$group=gmane.comp.clustering.open-mpi.user">
    <title>Search Engine</title>
    <description>Search the mailing list at Gmane</description>
    <name>query</name>
    <link>http://search.gmane.org/?group=$group=gmane.comp.clustering.open-mpi.user</link>
  </textinput>
</rdf:RDF>
