<?xml version="1.0" encoding="UTF-8"?>
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns="http://purl.org/rss/1.0/" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:syn="http://purl.org/rss/1.0/modules/syndication/" xmlns:admin="http://webns.net/mvcb/">
  <channel rdf:about="http://blog.gmane.org/gmane.comp.clustering.beowulf.general">
    <title>gmane.comp.clustering.beowulf.general</title>
    <link>http://blog.gmane.org/gmane.comp.clustering.beowulf.general</link>
    <description/>
    <syn:updatePeriod>hourly</syn:updatePeriod>
    <syn:updateFrequency>1</syn:updateFrequency>
    <syn:updateBase>1901-01-01T00:00+00:00</syn:updateBase>
    <items>
      <rdf:Seq>
        <rdf:li rdf:resource="http://comments.gmane.org/gmane.comp.clustering.beowulf.general/29507"/>
        <rdf:li rdf:resource="http://comments.gmane.org/gmane.comp.clustering.beowulf.general/29502"/>
        <rdf:li rdf:resource="http://comments.gmane.org/gmane.comp.clustering.beowulf.general/29497"/>
        <rdf:li rdf:resource="http://comments.gmane.org/gmane.comp.clustering.beowulf.general/29494"/>
        <rdf:li rdf:resource="http://comments.gmane.org/gmane.comp.clustering.beowulf.general/29493"/>
        <rdf:li rdf:resource="http://comments.gmane.org/gmane.comp.clustering.beowulf.general/29487"/>
        <rdf:li rdf:resource="http://comments.gmane.org/gmane.comp.clustering.beowulf.general/29478"/>
        <rdf:li rdf:resource="http://comments.gmane.org/gmane.comp.clustering.beowulf.general/29477"/>
        <rdf:li rdf:resource="http://comments.gmane.org/gmane.comp.clustering.beowulf.general/29475"/>
        <rdf:li rdf:resource="http://comments.gmane.org/gmane.comp.clustering.beowulf.general/29474"/>
        <rdf:li rdf:resource="http://comments.gmane.org/gmane.comp.clustering.beowulf.general/29473"/>
        <rdf:li rdf:resource="http://comments.gmane.org/gmane.comp.clustering.beowulf.general/29461"/>
        <rdf:li rdf:resource="http://comments.gmane.org/gmane.comp.clustering.beowulf.general/29447"/>
        <rdf:li rdf:resource="http://comments.gmane.org/gmane.comp.clustering.beowulf.general/29438"/>
        <rdf:li rdf:resource="http://comments.gmane.org/gmane.comp.clustering.beowulf.general/29418"/>
        <rdf:li rdf:resource="http://comments.gmane.org/gmane.comp.clustering.beowulf.general/29413"/>
        <rdf:li rdf:resource="http://comments.gmane.org/gmane.comp.clustering.beowulf.general/29407"/>
        <rdf:li rdf:resource="http://comments.gmane.org/gmane.comp.clustering.beowulf.general/29406"/>
        <rdf:li rdf:resource="http://comments.gmane.org/gmane.comp.clustering.beowulf.general/29403"/>
        <rdf:li rdf:resource="http://comments.gmane.org/gmane.comp.clustering.beowulf.general/29402"/>
      </rdf:Seq>
    </items>
    <image rdf:resource="http://gmane.org/img/gmane-25t.png"/>
    <textinput rdf:resource=""/>
  </channel>
  <image rdf:about="http://gmane.org/img/gmane-25t.png">
    <title>Gmane</title>
    <url>http://gmane.org/img/gmane-25t.png</url>
    <link>http://gmane.org</link>
  </image>
  <item rdf:about="http://comments.gmane.org/gmane.comp.clustering.beowulf.general/29507">
    <title>Forward: RE:</title>
    <link>http://comments.gmane.org/gmane.comp.clustering.beowulf.general/29507</link>
    <description>&lt;pre&gt;

I know Penguin runs the list, but I'm not sure who
to contact, I'll forward it to the list. Hopefully
someone will be able to provide an answer.

--
Doug

URL doesn't work (for me)
either bounce or vanish down a /dev/null hole.


&lt;/pre&gt;</description>
    <dc:creator>Douglas Eadline</dc:creator>
    <dc:date>2012-05-03T12:40:59</dc:date>
  </item>
  <item rdf:about="http://comments.gmane.org/gmane.comp.clustering.beowulf.general/29502">
    <title>Intel NUC</title>
    <link>http://comments.gmane.org/gmane.comp.clustering.beowulf.general/29502</link>
    <description>&lt;pre&gt;http://www.theregister.co.uk/2012/05/01/intel_pi_rival_nuc/

 

Ohhh.... 

Thinking of how to cool a rack full of these things with 2x 16 GB DIMMs
in each.

Looks like you could seal that case and use immersion cooling -
partially dip the case in the coolant but leave the top dry???

 

2x mini PCIe slots for that fast interconnect

Though - does anyone know much about networking over thunderbolt?

 

 

 

John Hearns | CFD Hardware Specialist | McLaren Racing Limited
McLaren Technology Centre, Chertsey Road, Woking, Surrey GU21 4YH, UK


T:  +44 (0) 1483 262000

D:  +44 (0) 1483 262352

F:  +44 (0) 1483 261928 
E:  john.hearns&amp;lt; at &amp;gt;mclaren.com

W: www.mclaren.com &amp;lt;http://www.mclaren.com/&amp;gt; 

 


The contents of this email are confidential and for the exclusive use of the intended recipient.  If you receive this email in error you should not copy it, retransmit it, use it or disclose its contents but should return it to the sender immediately and delete your copy.
&lt;/pre&gt;</description>
    <dc:creator>Hearns, John</dc:creator>
    <dc:date>2012-05-01T15:10:27</dc:date>
  </item>
  <item rdf:about="http://comments.gmane.org/gmane.comp.clustering.beowulf.general/29497">
    <title>yikes: intel buys cray's spine</title>
    <link>http://comments.gmane.org/gmane.comp.clustering.beowulf.general/29497</link>
    <description>&lt;pre&gt;http://www.eetimes.com/electronics-news/4371639/Cray-sells-interconnect-hardware-unit-to-Intel

that's one market where AMD no longer plays eh?
&lt;/pre&gt;</description>
    <dc:creator>Mark Hahn</dc:creator>
    <dc:date>2012-04-25T02:58:39</dc:date>
  </item>
  <item rdf:about="http://comments.gmane.org/gmane.comp.clustering.beowulf.general/29494">
    <title>New industry for Iceland?</title>
    <link>http://comments.gmane.org/gmane.comp.clustering.beowulf.general/29494</link>
    <description>&lt;pre&gt;Combine this article:

"A Cool Place for Cheap Flops"
http://www.hpcwire.com/hpcwire/2012-04-11/a_cool_place_for_cheap_flops.html

With this paper:

"Relativistic Statistical Arbitrage"
dspace.mit.edu/openaccess-disseminate/1721.1/62859

And it looks like Iceland has a new industry: Datacenters for the
high-frequency trading (HFT) gang.

Just remember - you heard it here first, folks! ;)

&lt;/pre&gt;</description>
    <dc:creator>Prentice Bisbal</dc:creator>
    <dc:date>2012-04-20T13:37:34</dc:date>
  </item>
  <item rdf:about="http://comments.gmane.org/gmane.comp.clustering.beowulf.general/29493">
    <title>Next release of Open Grid Scheduler &amp; the Gompute UserGroup Meeting</title>
    <link>http://comments.gmane.org/gmane.comp.clustering.beowulf.general/29493</link>
    <description>&lt;pre&gt;The next release of Open Grid Scheduler/Grid Engine will be released
at the Gompute User Group Meeting. The Gompute User Group Meeting is a
free, 2-day, HPC event in Gothenburg, Sweden.

Register for the event at: http://www.simdi.se/

** Please let me know if you are interested in a Grid Engine track.

Gridcore/Gompute contributed booth space at SC11 for the Grid Engine
2011.11 release (the first major release of open-source Grid Engine
after separation from Oracle), and joined the Open Grid Scheduler
project in April 2012.

Rayson

=================================
Open Grid Scheduler / Grid Engine
http://gridscheduler.sourceforge.net/

Scalable Grid Engine Support Program
http://www.scalablelogic.com/
&lt;/pre&gt;</description>
    <dc:creator>Rayson Ho</dc:creator>
    <dc:date>2012-04-19T18:34:08</dc:date>
  </item>
  <item rdf:about="http://comments.gmane.org/gmane.comp.clustering.beowulf.general/29487">
    <title>Migrating from IB datagram mode to connected mode live ?</title>
    <link>http://comments.gmane.org/gmane.comp.clustering.beowulf.general/29487</link>
    <description>&lt;pre&gt;Hi folks,

For hysterical raisins we have an IBM iDataPlex system which is
running QDR IB in datagram mode.  To that IB network we'll be adding
another QDR system which can only run in connected mode.

The kicker is that our IB network is used for GPFS over IPoIB and so
our NSDs will need to move to connected mode for the new system.

I've been Googling without success to find out if you can do such a
migration live (i.e. change the servers to connected mode, increase
their MTUs and then migrate clients to connected mode; we have enough
redundancy in servers to do this) or whether we'll need to schedule an
outage and take the whole system down and bring it back up in
connected mode.

Any thoughts?

cheers,
Chris
-- 
    Christopher Samuel - Senior Systems Administrator
 VLSCI - Victorian Life Sciences Computation Initiative
 Email: samuel&amp;lt; at &amp;gt;unimelb.edu.au Phone: +61 (0)3 903 55545
         http://www.vlsci.unimelb.edu.au/

&lt;/pre&gt;</description>
    <dc:creator>Christopher Samuel</dc:creator>
    <dc:date>2012-04-19T01:53:51</dc:date>
  </item>
  <item rdf:about="http://comments.gmane.org/gmane.comp.clustering.beowulf.general/29478">
    <title>Questions about upgrading InfiniBand</title>
    <link>http://comments.gmane.org/gmane.comp.clustering.beowulf.general/29478</link>
    <description>&lt;pre&gt;Beowulfers,

I'm planning on adding some upgrades to my existing cluster, which has
66 compute nodes plus the head node. Networking consists of a Cisco
7012 IB switch with 6 out of 12 line cards installed, giving me a
capacity of 72 DDR ports, expandable to 144, and two 40-port ethernet
switches that have only six extra ports between them.

I'd like to add a Lustre filesystem (over InfiniBand) to my cluster,
and then begin adding/replacing nodes in the cluster. Obviously, I'll
need to increase capacity of both my IB and ethernet networks. The
questions I have are about upgrading my InfiniBand.

1. It looks like QLogic is out of the InfiniBand business. Is Mellanox
the only game in town these days?

2. Due to the size of my cluster, it looks like buying just a
core/enterprise IB switch with capacity for ~100 ports is the best
option (I don't expect my cluster to go much bigger than this in the
next 4-5 years).  Based on those criteria, it looks like the Mellanox
IS5100 is my only option. Am I overlooking other options?

http://www.mellanox.com/content/pages.php?pg=products_dyn&amp;amp;product_family=71&amp;amp;menu_section=49

3. In my searching yesterday, I didn't find any FDR core/enterprise
switches with &amp;gt; 36 ports, other than the Mellanox SX6536. At 648 ports,
the SX6536 is too big for my needs. I've got to be overlooking other
products, right?

http://www.mellanox.com/content/pages.php?pg=products_dyn&amp;amp;product_family=122&amp;amp;menu_section=49

4. Adding an additional line card to my existing switch looks like it
will cost me only ~$5,000, and give me the additional capacity I'll need
for the next 1-2 years. I'm thinking it makes sense to do that, and wait
for affordable FDR switches to come out with the port count I'm looking
for instead of upgrading to QDR right now, and start buying hardware
with FDR HCAs in preparation for that.  Please feel free to
agree/disagree. This brings me to my next question...

5. FDR and QDR should be backwards compatible with my existing DDR
hardware, but how exactly does that work? If I have, say, an FDR switch with a
mixture of FDR, QDR, and DDR HCAs, will the whole fabric slow down to
the lowest common denominator, or will the slow-down be based on the two
nodes involved in the communication only? When I googled for an answer,
all I found were marketing documents that guaranteed backwards
compatibility, but didn't go to this level of detail. I searched the
standard spec (v1.2.1), and didn't find an obvious answer to this question.

6. I see some Mellanox docs saying their FDR switches are compliant with
v1.3 of the standard, but the latest version available for download is
1.2.1. I take it the final version of 1.3 hasn't been ratified yet. Is
that correct?
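
Editor's sketch, not part of the original post: the port arithmetic behind
question 4 can be checked with a few lines of Python. The node, card and
port counts are the ones quoted above; the idea that every node occupies
exactly one IB port is an assumption made for illustration only.

```python
# Port-budget sketch using the figures quoted in the post.
# Assumption (not from the post): every node uses exactly one IB port.

COMPUTE_NODES = 66      # from the post
HEAD_NODES = 1          # from the post
INSTALLED_CARDS = 6     # line cards currently in the chassis
MAX_CARDS = 12          # line card slots in the chassis
INSTALLED_PORTS = 72    # DDR ports available today


def port_budget(extra_cards):
    """Return (total_ports, free_ports) after adding extra_cards line cards."""
    if extra_cards + INSTALLED_CARDS not in range(INSTALLED_CARDS, MAX_CARDS + 1):
        raise ValueError("chassis holds between 6 and 12 line cards here")
    ports_per_card = INSTALLED_PORTS // INSTALLED_CARDS
    total = INSTALLED_PORTS + extra_cards * ports_per_card
    used = COMPUTE_NODES + HEAD_NODES
    return total, total - used


for cards in (0, 1, 6):
    total, free = port_budget(cards)
    print(f"{cards} extra card(s): {total} ports, {free} free")
```

Under that assumption, one extra line card would raise the number of free
ports from 5 to 17, which is the headroom the post is weighing against a
new QDR or FDR switch.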

&lt;/pre&gt;</description>
    <dc:creator>Prentice Bisbal</dc:creator>
    <dc:date>2012-04-18T15:05:05</dc:date>
  </item>
  <item rdf:about="http://comments.gmane.org/gmane.comp.clustering.beowulf.general/29477">
    <title>2 Security bugs fixed in Grid Engine</title>
    <link>http://comments.gmane.org/gmane.comp.clustering.beowulf.general/29477</link>
    <description>&lt;pre&gt;There were 2 security related bugs fixed and released in Grid Engine today:

- Code injection via LD_* environment variables
- sgepasswd buffer overflow

Oracle fixed both of them in their CPU (Critical Patch Update) release
for Oracle Grid Engine this afternoon.

For Sun Grid Engine (6.2u5) and Open Grid Scheduler/Grid Engine, visit:

http://gridscheduler.sourceforge.net/security.html

The first one was found by William Hay back in Nov 2011. And the
second one was reported by an outside security researcher to Oracle.
The details of the bug were passed on to me, and we (all the Grid
Engine forks) decided that we should share any security-related
information instead of putting it in marketing slides.

Download patches and pre-compiled binaries for:

- SGE 6.2u5, 6.2u5p1, 6.2u5p2
- Open Grid Scheduler/Grid Engine 2011.11

from the URL above.

To apply the patches, just replace the older version of the binaries
with the newer version.

Rayson

=================================
Open Grid Scheduler / Grid Engine
http://gridscheduler.sourceforge.net/

Scalable Grid Engine Support Program
http://www.scalablelogic.com/
&lt;/pre&gt;</description>
    <dc:creator>Rayson Ho</dc:creator>
    <dc:date>2012-04-18T00:06:23</dc:date>
  </item>
  <item rdf:about="http://comments.gmane.org/gmane.comp.clustering.beowulf.general/29475">
    <title>Ubuntu MAAS</title>
    <link>http://comments.gmane.org/gmane.comp.clustering.beowulf.general/29475</link>
    <description>&lt;pre&gt;I read a ZDNet article on Ubuntu LTS pitching to be your cloud and data
centre distribution of choice.

It mentions Ubuntu Metal-As-A-Service

 

http://www.markshuttleworth.com/archives/1103

 

https://wiki.ubuntu.com/ServerTeam/MAAS/

 

I guess this is what clustering types have been doing for a long time
with various cluster deployment and management suites.

 

Also note Mark Shuttleworth's comment about the cost of the OS per node:

"As we enter an era in which ATOM is as important in the data centre as
XEON, an operating system like Ubuntu makes even more sense"

I guess this chimes with the initial Beowulfery spirit - when you have
low-cost nodes, why use an OS (whether it is Windows, Solaris, etc.)
which is a significant fraction of the node's cost?

 

 

John Hearns | CFD Hardware Specialist | McLaren Racing Limited
McLaren Technology Centre, Chertsey Road, Woking, Surrey GU21 4YH, UK


T:  +44 (0) 1483 262000

D:  +44 (0) 1483 262352

F:  +44 (0) 1483 261928 
E:  john.hearns&amp;lt; at &amp;gt;mclaren.com

W: www.mclaren.com &amp;lt;http://www.mclaren.com/&amp;gt; 

 


&lt;/pre&gt;</description>
    <dc:creator>Hearns, John</dc:creator>
    <dc:date>2012-04-17T15:26:23</dc:date>
  </item>
  <item rdf:about="http://comments.gmane.org/gmane.comp.clustering.beowulf.general/29474">
    <title>openmpi 2.2 standards and infiniband cards</title>
    <link>http://comments.gmane.org/gmane.comp.clustering.beowulf.general/29474</link>
    <description>&lt;pre&gt;hi,

I'm reading the MPI 2.2 standard and my eye fell on something
amazing.

http://www.mpi-forum.org/docs/mpi-2.2/mpi22-report.pdf

chapter 11 "one-sided communications"
page 339:

"it is erroneous to have concurrent conflicting accesses to the same  
memory location in a window"

Does this mean that each update, either read or write, is in itself
atomic with InfiniBand?

In computer chess it can happen that we simply write to and read from the
same locations. This can of course result in garbled data. Most don't care;
some, like me, store a CRC and care even less. The odds that it happens are
relatively small, but it happens. I measured (on an Origin3800 @ 200 CPUs
@ 120 GB RAM) that about once every 200 billion operations 2 writes hit the
same location at the same time, resulting in garbage written to that specific
cacheline, or to 2 consecutive cachelines sharing 20 bytes of data (obviously
usually this last case happens; on PC hardware actually only the last case
can occur, and entries garbled within 1 cacheline).

Now the actual reads are around 160 bytes, from which only 20 bytes will
get used, so the statistical odds are a lot larger than this 1 in 200 billion
that overlapping parts of RAM get requested by 2 or more cores at the same
time, randomly somewhere in the cluster, and/or that writes of 20 bytes fall
within that range.

What's actually happening in hardware here?

As it says further: "if a location is updated by a put or accumulate  
operation, then this location cannot be
accessed by a load or another RMA operation until the updating  
operation has completed."

Well it's gonna happen, not much, but sometimes.

Of course i don't care if there is some slowdown in that one-in-a-billion
case where 2 or more cores write/read the same memory within the window, but
i do care when normal operations get slowed down by this spec as given in
MPI 2.2 :)

If remote cores read or write RAM by put/get (usually different,
non-overlapping RMA requests) at a random 20-160 bytes scattered through,
say, a gigabyte of RAM on the receiving node, can the receiving node then
issue those, say, half a dozen random lookups/writes to the gigabyte RAM buffer
in a concurrent manner?
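
Editor's sketch, not part of the original post: the "store a CRC" idea
mentioned above can be illustrated in Python. The 20-byte payload size is
taken from the post; the entry layout, the function names and the use of
zlib's CRC-32 are assumptions for illustration and say nothing about how
Diep actually does it.

```python
import struct
import zlib

PAYLOAD_BYTES = 20       # entry size mentioned in the post
ENTRY_FORMAT = "=20sI"   # hypothetical layout: 20 payload bytes plus a 32-bit CRC


def pack_entry(payload):
    """Serialize a payload together with its CRC-32."""
    if len(payload) != PAYLOAD_BYTES:
        raise ValueError("payload must be exactly 20 bytes")
    return struct.pack(ENTRY_FORMAT, payload, zlib.crc32(payload))


def unpack_entry(raw):
    """Return the payload, or None if the stored CRC does not match."""
    payload, stored_crc = struct.unpack(ENTRY_FORMAT, raw)
    if zlib.crc32(payload) != stored_crc:
        return None
    return payload


# Two writers store entries that differ in one byte.
entry_a = pack_entry(b"move-e2e4-score-0042")
entry_b = pack_entry(b"move-e2e4-score-0043")

# Simulated torn write: payload from writer A, CRC from writer B.
torn = entry_a[:PAYLOAD_BYTES] + entry_b[PAYLOAD_BYTES:]

print(unpack_entry(entry_a))   # intact entry is returned
print(unpack_entry(torn))      # garbled entry is rejected
```

A reader that gets None simply treats the entry as a hashtable miss, which
is why occasional garbled writes are tolerable in this scheme.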

&lt;/pre&gt;</description>
    <dc:creator>Vincent Diepeveen</dc:creator>
    <dc:date>2012-04-16T03:26:51</dc:date>
  </item>
  <item rdf:about="http://comments.gmane.org/gmane.comp.clustering.beowulf.general/29473">
    <title>Infiniband Advice which functions to use for what purpose</title>
    <link>http://comments.gmane.org/gmane.comp.clustering.beowulf.general/29473</link>
    <description>&lt;pre&gt;hi,

Trying to make a new model for InfiniBand for Diep.
I need some advice on which function calls/libraries to use for the fastest
possible communication over InfiniBand (Mellanox QDR) from one node to another.

There are a lot of possibilities there, but what communicates fastest?

I need 2 different types of communication, possibly 3 or more.
I can still set up the model for how to communicate, so let's test
the waters:

a) Each node has a 1.5 GB cache, so that's 1.5 GB * n.
   Each core of each node randomly needs 192 bytes. It doesn't know in
   advance which node holds them, nor where in the gigabytes of cache
   (hashtable) it needs to read.

   What library and which function call is best to ask for this?

   Realize all 8 cores are busy; if i need to keep 1 core free handling
   all requests from all other nodes, that slows down each machine
   significantly, as i lose 1 core then.

b) For starting and stopping the different cores (at all nodes) in a
   decentralized manner, some variables are difficult to keep decentralized.
   You want them broadcast to all nodes somehow, updating shared memory at
   remote nodes in some manner, so that the Mellanox card writes into the RAM
   without interrupting the probably 8 running cores, nor needing any of
   them to handle this.

   Is that possible somehow? If so, is it possible to update it with 1
   function call to all n-1 other nodes?

c) Memory migration - which possibilities are there to do this? i probably
   need to build a manual memory migration for when a specific job gets taken
   over from 1 node by another. Which function calls would you advise using
   there? Is there documentation on how to efficiently implement memory
   migration?

   I need to migrate roughly 2 kilobytes at a time. This doesn't happen too
   often obviously, yet the algorithms are so complex that i can't avoid doing
   this if i want the utmost performance, so i figured out on paper.
   And yes, i do know there is some stuff that already has this built in,
   but that's possibly too slow for what i need.

d) Atomic reads/writes/spinlocks over InfiniBand. There probably is a
   function to set a lock at a remote memory address; which one is it?
   Is there also a function call that sets a lock and, when the lock is
   successful, directly returns a bunch of bytes from a specific address
   (near the lock)? That would save me the procedure of first setting a
   lock, then sitting and waiting until the lock is set, then issuing the
   read. That means we ship something from node A to B, then when the lock
   is set at B, it goes back to A. Then A can finally read its bytes at B,
   as it has the lock set. Is there a combined function that is faster than
   this, and that returns those bytes to A directly after it gets the lock
   at B?

e) When doing the spinlock from A, is the core A.c that tries to set the
   lock at node B actually spinning?
   My previous experience is that some implementations, now and/or in the
   past, instead of having your core spin for a bunch of microseconds, put
   your core to idle. Put simply, it then needs to get fired by the runqueue
   once again, which means a 10-30 millisecond delay until it has received
   that data.
   Do cores get put in prison for up to 30 years when trying to set a lock
   with the function call in d), do i have both options, or am i so lucky?


Many thanks for taking a look at my questions and even more to those  
responding!

Kind Regards,
Vincent

&lt;/pre&gt;</description>
    <dc:creator>Vincent Diepeveen</dc:creator>
    <dc:date>2012-04-10T00:14:52</dc:date>
  </item>
  <item rdf:about="http://comments.gmane.org/gmane.comp.clustering.beowulf.general/29461">
    <title>Nvidia's quantum leap in 28 nm</title>
    <link>http://comments.gmane.org/gmane.comp.clustering.beowulf.general/29461</link>
    <description>&lt;pre&gt;Some 12 years ago a genius visited me. His expertise being the
same as Einstein's, it's not much of a question what his research topics were.

Though not deep into computer hardware, he told me that for massive
computing, just above the 1 GHz border would prove to be a big barrier, as
electrons basically move at around 1/3 of lightspeed, which translates to
1.3 GHz in metals like aluminium. For copper, he said, that barrier might be
a tad higher than for aluminium, yet even then the power needed for such
speeds would prove to be massive.

At that moment Intel's marketing department shouted out loud that their
P4s would clock 10 GHz by 2010.

Well, the P4 never got there and we got into the megacore count game
for HPC.

AMD
Now AMD needs 4 PEs for doing double precision, so their core count
of 1536 actually wasn't more than the 5000 series with 1600. Their new 7970
GPU with 2048 PEs has the double precision equivalent in core count of 512
compute cores.

Actually the 7970 mostly profits from a 100 MHz higher frequency, with
some boosting to 1 GHz on some overclocked cards; it gets impressive game
scores. As for GPGPU, of course, moving from 1536 cores to 2048 is an
interesting improvement, yet far away from a doubling. The 7970 is said to
have around 4.31B transistors
(see http://www.anandtech.com/show/5261/amd-radeon-hd-7970-review )

NVIDIA FERMI
Fermi, Nvidia's 40 nm GPU which currently gets used in HPC, has 3
billion transistors. Here at home i have a few Tesla 2075s with 448 cores
producing a tad more than 0.5 Tflop, which was a big improvement over the
previous generation.

The Nvidia Fermi in the form of the GTX 560, on the other hand, clocks
1.644 GHz and the 580 clocks 1.544 GHz. For GPGPU this is on the risky side,
as getting far over that 1 GHz seems to be a problem. The Teslas are
therefore clocked at a safe 1.15 GHz.

NVIDIA KEPLER 2012
The new kid on the block from Nvidia is Kepler. It's in the 28 nm
process technology, just like AMD's 7970.
Now i'm not gonna redo a review for games; there are great sites for
that.

http://www.anandtech.com/show/5699/nvidia-geforce-gtx-680-review/1

Over here we are interested in the implications for beowulf systems of
course; i read that as HPC implications.
Let's look at the facts and then speculate what that means for HPC:

I'm still trying to fully understand the differences, yet it seems as
if Nvidia clocked the cores back to 1 GHz. That should make it easier to
release a GPU for GPGPU as well. In the meantime core count went up to 1536.

The chip itself has 3.5 billion transistors, just 500M more than
Fermi, while the process is a factor 2.04 smaller, which means it will
consume a lot less juice. Benchmarks at AnandTech confirm this.

Now that's a MASSIVE quantum leap. Basically a factor 3 in the number of
cores available to HPC.

In addition, the memory is 256 bits wide, versus 384 bits for
Fermi. This should make it easier to release 2 GPUs on a single card.
Whether Nvidia has those plans for GPGPU Teslas we can only
speculate about; as the chip eats less juice, it sure fits within the
power envelope this time. So where the gamer kids can surely expect a
690 GPU, for HPC we of course cheer if Nvidia manages to improve to
1.5 - 1.7 Tflop for their new GPU, with the option to move to
3 - 3.4 Tflop double precision for a 2-GPU Tesla card.

Note that some might argue that the 680 has less double precision
capability than the 580. However, for the Tesla this doesn't matter,
as what happens with gamer cards is that they disable some
transistors; so the Tesla GPU will be the exact same chip as the kids have,
just with double precision enabled. The same thing was the case
with Fermi, so it's logical to expect that to happen with Kepler as
well.

Seems like Intel can also scrap their current corner project, as they
have a new goal, namely 4 Tflop, rather than a 1 Tflop manycore :)

As for Nvidia, releasing a new chip that's factor 3 the power of your  
previous one for gpgpu sure is a big quantum leap!
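
Editor's sketch, not part of the original post: the ratios quoted above can
be reproduced with a few lines of Python. All input figures come from the
post; treating die area as scaling with the square of the process node is
an idealized assumption, and the variable names are made up for this
example.

```python
# Figures quoted in the post; the square-law area scaling is an idealization.
FERMI_NM = 40
KEPLER_NM = 28
FERMI_TESLA_CORES = 448      # Tesla 2075, as quoted
KEPLER_CORES = 1536          # Kepler core count, as quoted
FERMI_TRANSISTORS = 3.0e9
KEPLER_TRANSISTORS = 3.5e9


def area_shrink(old_nm, new_nm):
    """Ideal area reduction factor when moving between process nodes."""
    return (old_nm / new_nm) ** 2


shrink = area_shrink(FERMI_NM, KEPLER_NM)
core_ratio = KEPLER_CORES / FERMI_TESLA_CORES
transistor_ratio = KEPLER_TRANSISTORS / FERMI_TRANSISTORS

print(f"ideal area shrink 40 nm to 28 nm: {shrink:.2f}x")
print(f"core count ratio:                 {core_ratio:.2f}x")
print(f"transistor count ratio:           {transistor_ratio:.2f}x")
```

The first figure matches the "factor 2.04" in the post, and the core ratio
of roughly 3.4 is where the "factor 3" claim comes from. Note that this
compares core counts only and says nothing about per-core throughput.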
&lt;/pre&gt;</description>
    <dc:creator>Vincent Diepeveen</dc:creator>
    <dc:date>2012-03-25T18:47:00</dc:date>
  </item>
  <item rdf:about="http://comments.gmane.org/gmane.comp.clustering.beowulf.general/29447">
    <title>Google greywater cooling</title>
    <link>http://comments.gmane.org/gmane.comp.clustering.beowulf.general/29447</link>
    <description>&lt;pre&gt;Flagging up yet another Register article I'm afraid, but it is
interesting

 

http://www.theregister.co.uk/2012/03/20/google_greywater_data_center_cooling/


&lt;/pre&gt;</description>
    <dc:creator>Hearns, John</dc:creator>
    <dc:date>2012-03-21T14:11:51</dc:date>
  </item>
  <item rdf:about="http://comments.gmane.org/gmane.comp.clustering.beowulf.general/29438">
    <title>oil immersion cooled blades</title>
    <link>http://comments.gmane.org/gmane.comp.clustering.beowulf.general/29438</link>
    <description>&lt;pre&gt;
http://www.heise.de/newsticker/meldung/Server-Blades-in-Oel-1471734.html

(translation courtesy Google Translate):

Server blades in oil

Hardcore Computer LSS 200

Photo: Boston

The U.S. company Hardcore Computer has developed the Liquid Submerged
Server (LSS 200), a server blade that uses immersion cooling: the entire
unit sits in a closed housing through which oil flows. According to the
safety data sheet, this "Core Coolant" is a non-toxic and biodegradable
compound based on a synthetic wax. Among the advantages of this cooling
method, Hardcore Computer names more efficient cooling, because the coolant
can be transported directly to heat exchangers, for example outside the data
center, without the detour through cold air. Because such a rack releases no
warm air directly into its surroundings, it can also be operated in
locations without air conditioning.

The server manufacturer Boston has added the LSS 200 to its product line
and offers it in Germany. However, Boston names neither prices nor delivery
dates, and the server contains technology that is by now slightly dusty: the
Xeon 5600 falls off significantly compared with the recently announced
Xeon E5. The Hardcore Computer data sheet (PDF file) for the CLS 200 also
leaves questions open. For example, detailed information is lacking on the
power supplies, the special chassis and the oil-cooling equipment. The
CLS 200 is also supposed to allow the use of a PCIe expansion card, such as a
Tesla accelerator or an InfiniBand adapter card; whether both work at the
same time remains uncertain. Finally, the question of hard disks remains
open; at least, only SSDs in 2.5-inch format are mentioned.

The entire board is surrounded by oil.  Picture: Hardcore Computer

In the United States, Hardcore Computer sells the desktop PCs Reactor and
Reactor X and the workstation Detonator with immersion cooling. Oil as a
coolant has a lower specific heat capacity than water, but it cools better
than air, and leaks cause no short circuits. With complete immersion the
coolant reaches all the components, whereas water cooling needs individual
heat sinks, which do not reach every critical component on every board.
Water cooling can be exploited optimally only if the motherboard design is
optimized for it. (ciw)
&lt;/pre&gt;</description>
    <dc:creator>Eugen Leitl</dc:creator>
    <dc:date>2012-03-14T15:22:24</dc:date>
  </item>
  <item rdf:about="http://comments.gmane.org/gmane.comp.clustering.beowulf.general/29418">
    <title>Supercomputers - iPad versus Cray</title>
    <link>http://comments.gmane.org/gmane.comp.clustering.beowulf.general/29418</link>
    <description>&lt;pre&gt;http://www.theregister.co.uk/2012/03/08/supercomputing_vs_home_usage/

 

A rather nice Register article on costs for supercomputers, adjusted to
2010 dollars,

and a rather interesting cost per megaflop table on the second page.


&lt;/pre&gt;</description>
    <dc:creator>Hearns, John</dc:creator>
    <dc:date>2012-03-08T13:43:10</dc:date>
  </item>
  <item rdf:about="http://comments.gmane.org/gmane.comp.clustering.beowulf.general/29413">
    <title>seamicro fabric?</title>
    <link>http://comments.gmane.org/gmane.comp.clustering.beowulf.general/29413</link>
    <description>&lt;pre&gt;Has anyone found an informative description of the Seamicro fabric?
various heavily masticated reports on the web say it's a 3d torus,
"low latency" and 160 GB/s (per link?  bisection?  in-chassis?)

thanks, mark.
&lt;/pre&gt;</description>
    <dc:creator>Mark Hahn</dc:creator>
    <dc:date>2012-03-04T19:35:59</dc:date>
  </item>
  <item rdf:about="http://comments.gmane.org/gmane.comp.clustering.beowulf.general/29407">
    <title>Pbsnodes xml format</title>
    <link>http://comments.gmane.org/gmane.comp.clustering.beowulf.general/29407</link>
    <description>&lt;pre&gt;Slightly off topic.

Would some kind soul who is running Torque send me some sample output
from   pbsnodes -x

 

Thanks


&lt;/pre&gt;</description>
    <dc:creator>Hearns, John</dc:creator>
    <dc:date>2012-03-01T11:37:58</dc:date>
  </item>
  <item rdf:about="http://comments.gmane.org/gmane.comp.clustering.beowulf.general/29406">
    <title>LSF Job Preemption &amp; Checkpointing</title>
    <link>http://comments.gmane.org/gmane.comp.clustering.beowulf.general/29406</link>
    <description>&lt;pre&gt;On Thu, Mar 1, 2012 at 2:20 AM, Jan Wender
&amp;lt;j.wender&amp;lt; at &amp;gt;science-computing.de&amp;gt; wrote:

IMO, LSF has the best job preemption &amp;amp; checkpointing support, with the
least integration effort needed from the end user &amp;amp; cluster
administrator. And resource preemption and license preemption are the
more advanced features of LSF.

(More manual configuration is needed for Grid Engine &amp;amp; Open
Grid Scheduler and/or other batch systems - not impossible, but it needs
knowledge of how to tune the scheduler.)



There are 3 types of checkpointing supported by LSF:

1) kernel-level
2) user-level
3) application-level

Kernel-level is easy: the OS kernel handles everything for the user
(for interactive processes) &amp;amp; the batch system (for jobs).

However, only IRIX, Cray UNICOS, and NEC SUPER-UX support kernel-level
checkpointing.

On Linux, you usually need to patch the kernel:

 - "Checkpoint/restart: it's complicated": http://lwn.net/Articles/414264/
 - "Kernel-based checkpoint and restart": http://lwn.net/Articles/293575/

(Lots of discussions on kernel-level checkpointing in the past few
years but still we don't have anything in the official tree yet...)

Or even kernel-assisted user-level checkpointing:

 - "Preparing for user-space checkpoint/restore":
http://lwn.net/Articles/478111/

And there is also the famous Berkeley Lab Checkpoint/Restart (BLCR),
which is a kernel module and thus you can use your distribution's
stock kernel:

 - "RCE 12: BLCR": http://www.rce-cast.com/Podcast/rce-12-blcr.html

 - "Checkpointing under Linux with Berkeley Lab Checkpoint/Restart":
http://gridscheduler.sourceforge.net/howto/APSTC-TB-2004-005.pdf


For user-level, you will need to link against a checkpointing library
shipped with LSF, which (I think) has some object-file-level init
routines that set things up so that state can be saved properly, and
which also needs to wrap standard libc functions &amp;amp; system calls
(I forgot the actual details; lots of academic papers were published
15 years ago and I recall reading a few of them, but just don't recall
the content :-D ).

See "Standalone Checkpointing":
http://research.cs.wisc.edu/condor/checkpointing.html

With user-level checkpointing &amp;amp; restart, you usually need to relink
your application (unless you use the LD_PRELOAD trick). So for
operating systems that don't support kernel-level checkpointing (i.e.
most OSes), user-level checkpointing usually works for most
general applications (I *think* Platform Computing even ported the LSF
checkpointing library to Windows as well - or at least that's what I
was told).


For application-level checkpointing, the applications will handle
everything. But of course each application needs to have its own
built-in support for checkpoint &amp;amp; restart.
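
As a rough illustration of the application-level case, here is a
minimal sketch in Python. It is not taken from LSF or from any real
application: the function names, the JSON checkpoint file and the
stop_at parameter (which only simulates a preemption) are all made up
for this example. The idea is just that the application itself saves
its loop state every few steps and picks up from the last saved state
when it is started again.

```python
import json
import os
import tempfile


def save_checkpoint(path, state):
    """Write state to a temp file, then atomically rename it over path."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    with os.fdopen(fd, "w") as handle:
        json.dump(state, handle)
        handle.flush()
        os.fsync(handle.fileno())
    # os.replace is atomic on the same filesystem, so a crash never
    # leaves a half-written checkpoint behind.
    os.replace(tmp_path, path)


def load_checkpoint(path):
    """Return the saved state, or a fresh one if no checkpoint exists."""
    if os.path.exists(path):
        with open(path) as handle:
            return json.load(handle)
    return {"next_step": 0, "total": 0}


def run(path, total_steps, checkpoint_every=10, stop_at=None):
    """Sum the integers from 0 up to total_steps - 1, checkpointing as it goes.

    stop_at (hypothetical, for demonstration only) simulates a
    preemption: the loop exits early when it reaches that step.
    """
    state = load_checkpoint(path)
    for step in range(state["next_step"], total_steps):
        if step == stop_at:
            return None  # preempted; work since the last checkpoint is lost
        state["total"] += step
        state["next_step"] = step + 1
        if state["next_step"] % checkpoint_every == 0:
            save_checkpoint(path, state)
    save_checkpoint(path, state)
    return state["total"]
```

Calling run("job.ckpt", 100, stop_at=55) stops early and leaves a
checkpoint from step 50 on disk; calling run("job.ckpt", 100) afterwards
resumes from that checkpoint instead of starting over. A real
application would also have to decide what counts as its state, which is
exactly the part no batch system can do for it.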

Rayson

=================================
Open Grid Scheduler / Grid Engine
http://gridscheduler.sourceforge.net/

Scalable Grid Engine Support Program
http://www.scalablelogic.com/



&lt;/pre&gt;</description>
    <dc:creator>Rayson Ho</dc:creator>
    <dc:date>2012-03-01T08:22:16</dc:date>
  </item>
  <item rdf:about="http://comments.gmane.org/gmane.comp.clustering.beowulf.general/29403">
    <title>Functionality of schedulers</title>
    <link>http://comments.gmane.org/gmane.comp.clustering.beowulf.general/29403</link>
    <description>&lt;pre&gt;Hi list!

Is there any scheduler which has the functionality to automatically put 
a running job on hold when another job with higher priority is submitted?

Preferably the state of the first job should be frozen, and saved to 
disk, so that it can be restarted again when the higher priority job has 
finished.

Is this at all possible (we are using torque/maui, and I couldn't find 
this feature there)?

Regards,

/jon
&lt;/pre&gt;</description>
    <dc:creator>Jon Tegner</dc:creator>
    <dc:date>2012-03-01T06:52:36</dc:date>
  </item>
  <item rdf:about="http://comments.gmane.org/gmane.comp.clustering.beowulf.general/29402">
    <title>amd buys seamicro</title>
    <link>http://comments.gmane.org/gmane.comp.clustering.beowulf.general/29402</link>
    <description>&lt;pre&gt;http://www.eetimes.com/electronics-news/4237271/AMD-to-buy-microserver-startup-SeaMicro

this is interesting.  most of the coverage seems to interpret this as 
using opterons (which makes some sense, given the direction bulldozer
is going, towards lots of space/power-effective cores.)

but here's another prospect: a box with lots of APU chips that max out
GPU density.  lotsa gflops/watt, very compact...

seamicro says their interconnect is special, low-lat, high-bw,
but it sounds like an onboard 10Gb chip to me.  calxeda's onboard
distributed switch might be more interesting.

regards, mark hahn.
&lt;/pre&gt;</description>
    <dc:creator>Mark Hahn</dc:creator>
    <dc:date>2012-03-01T06:16:49</dc:date>
  </item>
  <item rdf:about="http://comments.gmane.org/gmane.comp.clustering.beowulf.general/29399">
    <title>Raspberry Pi</title>
    <link>http://comments.gmane.org/gmane.comp.clustering.beowulf.general/29399</link>
    <description>&lt;pre&gt;thought some people here should see the Raspberry Pi --

$35 computer with Toronto-designed software sells out worldwide in minutes

        http://bit.ly/xDVEub  [takes you to thestar.com]

Say hi to the Raspberry Pi, the $35 computer (with photo)

        http://bit.ly/xDe8fJ [takes you to csmonitor.com]
&lt;/pre&gt;</description>
    <dc:creator>Douglas J. Trainor</dc:creator>
    <dc:date>2012-02-29T21:40:52</dc:date>
  </item>
  <textinput rdf:about="http://search.gmane.org/?group=$group=gmane.comp.clustering.beowulf.general">
    <title>Search Engine</title>
    <description>Search the mailing list at Gmane</description>
    <name>query</name>
    <link>http://search.gmane.org/?group=$group=gmane.comp.clustering.beowulf.general</link>
  </textinput>
</rdf:RDF>

