<?xml version="1.0" encoding="UTF-8"?>
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns="http://purl.org/rss/1.0/" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:syn="http://purl.org/rss/1.0/modules/syndication/" xmlns:admin="http://webns.net/mvcb/">
  <channel rdf:about="http://blog.gmane.org/gmane.comp.java.clojure.storm">
    <title>gmane.comp.java.clojure.storm</title>
    <link>http://blog.gmane.org/gmane.comp.java.clojure.storm</link>
    <description/>
    <syn:updatePeriod>hourly</syn:updatePeriod>
    <syn:updateFrequency>1</syn:updateFrequency>
    <syn:updateBase>1901-01-01T00:00+00:00</syn:updateBase>
    <items>
      <rdf:Seq>
        <rdf:li rdf:resource="http://comments.gmane.org/gmane.comp.java.clojure.storm/9678"/>
        <rdf:li rdf:resource="http://comments.gmane.org/gmane.comp.java.clojure.storm/9676"/>
        <rdf:li rdf:resource="http://comments.gmane.org/gmane.comp.java.clojure.storm/9673"/>
        <rdf:li rdf:resource="http://comments.gmane.org/gmane.comp.java.clojure.storm/9670"/>
        <rdf:li rdf:resource="http://comments.gmane.org/gmane.comp.java.clojure.storm/9654"/>
        <rdf:li rdf:resource="http://comments.gmane.org/gmane.comp.java.clojure.storm/9653"/>
        <rdf:li rdf:resource="http://comments.gmane.org/gmane.comp.java.clojure.storm/9651"/>
        <rdf:li rdf:resource="http://comments.gmane.org/gmane.comp.java.clojure.storm/9649"/>
        <rdf:li rdf:resource="http://comments.gmane.org/gmane.comp.java.clojure.storm/9641"/>
        <rdf:li rdf:resource="http://comments.gmane.org/gmane.comp.java.clojure.storm/9638"/>
        <rdf:li rdf:resource="http://comments.gmane.org/gmane.comp.java.clojure.storm/9631"/>
        <rdf:li rdf:resource="http://comments.gmane.org/gmane.comp.java.clojure.storm/9629"/>
        <rdf:li rdf:resource="http://comments.gmane.org/gmane.comp.java.clojure.storm/9620"/>
        <rdf:li rdf:resource="http://comments.gmane.org/gmane.comp.java.clojure.storm/9618"/>
        <rdf:li rdf:resource="http://comments.gmane.org/gmane.comp.java.clojure.storm/9617"/>
        <rdf:li rdf:resource="http://comments.gmane.org/gmane.comp.java.clojure.storm/9602"/>
        <rdf:li rdf:resource="http://comments.gmane.org/gmane.comp.java.clojure.storm/9596"/>
        <rdf:li rdf:resource="http://comments.gmane.org/gmane.comp.java.clojure.storm/9594"/>
        <rdf:li rdf:resource="http://comments.gmane.org/gmane.comp.java.clojure.storm/9593"/>
        <rdf:li rdf:resource="http://comments.gmane.org/gmane.comp.java.clojure.storm/9590"/>
      </rdf:Seq>
    </items>
    <image rdf:resource="http://gmane.org/img/gmane-25t.png"/>
    <textinput rdf:resource=""/>
  </channel>
  <image rdf:about="http://gmane.org/img/gmane-25t.png">
    <title>Gmane</title>
    <url>http://gmane.org/img/gmane-25t.png</url>
    <link>http://gmane.org</link>
  </image>
  <item rdf:about="http://comments.gmane.org/gmane.comp.java.clojure.storm/9678">
    <title>Aggregating all of the Storm Logs</title>
    <link>http://comments.gmane.org/gmane.comp.java.clojure.storm/9678</link>
    <description>&lt;pre&gt;With the large quantity of log files produced by storm by default, I was 
wondering how others were rolling up the logs into something that can be 
easily monitored for ERROR messages.
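
One simple way to roll the logs up is to scan every log file for ERROR lines. A minimal Python sketch, assuming the Storm logs sit in a single directory, end in ".log", and mark errors with "[ERROR]"; collect_errors and its parameters are hypothetical names, not part of Storm:

```python
import os

def collect_errors(log_dir, marker="[ERROR]"):
    """Return (file name, line) pairs for every log line containing marker."""
    hits = []
    for name in sorted(os.listdir(log_dir)):
        # Assumption: only files ending in ".log" are Storm logs.
        if not name.endswith(".log"):
            continue
        path = os.path.join(log_dir, name)
        with open(path, errors="replace") as handle:
            for line in handle:
                if marker in line:
                    hits.append((name, line.rstrip("\n")))
    return hits
```

Running this from cron on each node and mailing the result would be one way to monitor it; the directory to pass in depends on the installation.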

&lt;/pre&gt;</description>
    <dc:creator>Gary Malouf</dc:creator>
    <dc:date>2013-05-21T21:08:58</dc:date>
  </item>
  <item rdf:about="http://comments.gmane.org/gmane.comp.java.clojure.storm/9676">
    <title>0.8.2 setup</title>
    <link>http://comments.gmane.org/gmane.comp.java.clojure.storm/9676</link>
    <description>&lt;pre&gt;I’m trying to set up a 3-node system on 0.8.2.
I want one node to be ui, nimbus and a supervisor (vmddcdmaster)
I want the others to be just supervisors (vmddcdnode1, vmddcdnode2)
I’m going to write a custom scheduler as my workload is pretty well known 
so node1&amp;amp;node2 will be running jobs.
The master will just be the top node which coordinates things.

So..I have this same storm.yaml on all 3 nodes.  I run nimbus, supervisor, 
and ui on the master and just supervisor on the 2 nodes.

Master:
supervisor.slots.ports:
     - 6700
     - 6701
     - 6702
########### These MUST be filled in for a storm configuration
storm.zookeeper.servers:
     - "vmddcdmaster"
     - "vmddcdnode1"
     - "vmddcdnode2"
#
storm.local.dir: "/var/storm"
nimbus.host: "vmddcdmaster"

The UI shows only node1 in the Supervisor summary – master showed up for a 
short while then disappeared.  I expected all 3 machines to show up.
I imagine I’m doing something wrong but it’s not clear to me what is wrong 
and there are no error messages in any logs.

Here is the master supervisor.log – it doesn’t even try to connect to node1 
even though it’s in the connectString.

2013-05-21 12:16:25 ZooKeeper [INFO] Initiating client connection, 
connectString=vmddcdmaster:2181,vmddcdnode1:2181,vmddcdnode2:2181 
sessionTimeout=20000 watcher=com.netflix.curator.ConnectionState@4be2e35a
2013-05-21 12:16:25 ClientCnxn [INFO] Opening socket connection to server 
vmddcdmaster/10.2.131.233:2181
2013-05-21 12:16:25 ClientCnxn [INFO] Socket connection established to 
vmddcdmaster/10.2.131.233:2181, initiating session
2013-05-21 12:16:25 ClientCnxn [INFO] Session establishment complete on 
server vmddcdmaster/10.2.131.233:2181, sessionid = 0x136c1ca59230020, 
negotiated timeout = 20000
2013-05-21 12:16:25 zookeeper [INFO] Zookeeper state update: :connected:none
2013-05-21 12:16:25 ZooKeeper [INFO] Session: 0x136c1ca59230020 closed
2013-05-21 12:16:25 ClientCnxn [INFO] EventThread shut down
2013-05-21 12:16:25 CuratorFrameworkImpl [INFO] Starting
2013-05-21 12:16:25 ZooKeeper [INFO] Initiating client connection, 
connectString=vmddcdmaster:2181,vmddcdnode1:2181,vmddcdnode2:2181/storm 
sessionTimeout=20000 watcher=com.netflix.curator.ConnectionState@818f289
2013-05-21 12:16:25 ClientCnxn [INFO] Opening socket connection to server 
vmddcdnode2/10.2.131.236:2181
2013-05-21 12:16:25 ClientCnxn [INFO] Socket connection established to 
vmddcdnode2/10.2.131.236:2181, initiating session
2013-05-21 12:16:25 ClientCnxn [INFO] Session establishment complete on 
server vmddcdnode2/10.2.131.236:2181, sessionid = 0x13ec3014c71000e, 
negotiated timeout = 20000
2013-05-21 12:16:25 supervisor [INFO] Starting supervisor with id 
9d7aad01-02a6-4bab-8106-0c1e2b2bd883 at host vmddcdmaster.essexcorp.com

Zookeeper.log shows the opposite problem – never tries to connect to node2
2013-05-21 12:16:25 ZooKeeper [INFO] Initiating client connection, 
connectString=vmddcdmaster:2181,vmddcdnode1:2181,vmddcdnode2:2181 
sessionTimeout=20000 watcher=com.netflix.curator.ConnectionState@4db03533
2013-05-21 12:16:25 ClientCnxn [INFO] Opening socket connection to server 
vmddcdnode1/10.2.131.235:2181
2013-05-21 12:16:25 ClientCnxn [INFO] Socket connection established to 
vmddcdnode1/10.2.131.235:2181, initiating session
2013-05-21 12:16:25 ClientCnxn [INFO] Session establishment complete on 
server vmddcdnode1/10.2.131.235:2181, sessionid = 0x13ec2fe52f2000c, 
negotiated timeout = 20000
2013-05-21 12:16:25 zookeeper [INFO] Zookeeper state update: :connected:none
2013-05-21 12:16:25 ZooKeeper [INFO] Session: 0x13ec2fe52f2000c closed
2013-05-21 12:16:25 ClientCnxn [INFO] EventThread shut down
2013-05-21 12:16:25 CuratorFrameworkImpl [INFO] Starting
2013-05-21 12:16:25 ZooKeeper [INFO] Initiating client connection, 
connectString=vmddcdmaster:2181,vmddcdnode1:2181,vmddcdnode2:2181/storm 
sessionTimeout=20000 watcher=com.netflix.curator.ConnectionState@2991c9a6
2013-05-21 12:16:25 ClientCnxn [INFO] Opening socket connection to server 
vmddcdmaster/10.2.131.233:2181
2013-05-21 12:16:25 ClientCnxn [INFO] Socket connection established to 
vmddcdmaster/10.2.131.233:2181, initiating session
2013-05-21 12:16:25 ClientCnxn [INFO] Session establishment complete on 
server vmddcdmaster/10.2.131.233:2181, sessionid = 0x136c1ca5923001f, 
negotiated timeout = 20000
2013-05-21 12:16:25 nimbus [INFO] Starting Nimbus server...
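
For reference, the connectString in the logs above looks like the storm.zookeeper.servers entries from storm.yaml, each with port 2181 appended, plus a /storm root on the second connection. A minimal Python sketch of that composition; connect_string and its defaults are hypothetical and only mirror what the logs show, not Storm's actual code:

```python
def connect_string(servers, port=2181, root=""):
    """Join ZooKeeper hosts into a host:port list, with an optional root path."""
    # Assumption: every server listens on the same port, as in the logs above.
    return ",".join("%s:%d" % (host, port) for host in servers) + root


if __name__ == "__main__":
    hosts = ["vmddcdmaster", "vmddcdnode1", "vmddcdnode2"]
    print(connect_string(hosts))
    print(connect_string(hosts, root="/storm"))
```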

&lt;/pre&gt;</description>
    <dc:creator>Michael Black</dc:creator>
    <dc:date>2013-05-21T20:19:54</dc:date>
  </item>
  <item rdf:about="http://comments.gmane.org/gmane.comp.java.clojure.storm/9673">
    <title>python DRPC client</title>
    <link>http://comments.gmane.org/gmane.comp.java.clojure.storm/9673</link>
    <description>&lt;pre&gt;I've noticed that

https://github.com/nathanmarz/storm/tree/master/storm-core/src/py

seems to contain a Python DRPC client, apparently generated from Thrift, 
but I haven't seen any docs on it or heard of folks using it.

Has anybody had any experience using it?

&lt;/pre&gt;</description>
    <dc:creator>Homer Strong</dc:creator>
    <dc:date>2013-05-21T19:07:08</dc:date>
  </item>
  <item rdf:about="http://comments.gmane.org/gmane.comp.java.clojure.storm/9670">
    <title>ConnectionRefusedException submitting a topology to a remote cluster</title>
    <link>http://comments.gmane.org/gmane.comp.java.clojure.storm/9670</link>
    <description>&lt;pre&gt;Hi everybody

I am able to submit a small topology to a remote cluster, but only when 
everything runs on localhost.
Next I'd like to:
1) submit a topology to an AWS cluster
2) submit the topology to a cluster of virtual machines.

When I launch the storm jar command from my machine I get 
ConnectionRefusedException and I don't understand why.
For 1) I created a cluster with storm-deploy and attached to it, but I 
get the exception when running storm jar.
For 2) I have one virtual machine where I start nimbus and zookeeper. I 
changed the IP addresses in the storm.yaml file on the machine from which I 
want to submit the topology, with no success.

In both cases I get ConnectionRefusedException.

Any ideas why?
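
A refused connection usually means nothing accepted the TCP connection at the address the client used, so one quick check is whether the Nimbus port is reachable from the submitting machine. A minimal Python sketch; can_connect is a hypothetical helper, and the default port 6627 is an assumption based on Storm's default nimbus.thrift.port:

```python
import socket

def can_connect(host, port=6627, timeout=3.0):
    """Return True if a TCP connection to host:port can be opened."""
    try:
        # create_connection raises OSError on refusal, timeout or DNS failure.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```

Calling can_connect with the nimbus.host value from storm.yaml shows whether the problem is network reachability or something in the Storm configuration itself.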

Also, where do I have to put the storm.yaml file: in ~/.storm/storm.yaml or 
in the directory where I unpacked the Storm release? 
I have tried every location, but I am still not able to submit a topology to 
a remote cluster; as I said, it only works with a cluster on localhost.

Please help me understand what is wrong.

Regards,
Amedeo

&lt;/pre&gt;</description>
    <dc:creator>Amedeo Merlo</dc:creator>
    <dc:date>2013-05-21T16:47:29</dc:date>
  </item>
  <item rdf:about="http://comments.gmane.org/gmane.comp.java.clojure.storm/9654">
    <title>Nimbus tasks keep timing out</title>
    <link>http://comments.gmane.org/gmane.comp.java.clojure.storm/9654</link>
    <description>&lt;pre&gt;A little after deploying, I see the worker processes restarting and the 
zookeeper session timing out if not all tasks started in time. How do I 
resolve this issue?

2013-05-19 13:40:17 nimbus [INFO] Activating top: top-6-1368970817
2013-05-19 13:42:22 nimbus [INFO] Task top-6-1368970817:37 timed out
2013-05-19 13:42:22 nimbus [INFO] Task top-6-1368970817:7 timed out
2013-05-19 13:42:22 nimbus [INFO] Task top-6-1368970817:13 timed out
2013-05-19 13:42:22 nimbus [INFO] Task top-6-1368970817:19 timed out
2013-05-19 13:42:22 nimbus [INFO] Task top-6-1368970817:23 timed out
2013-05-19 13:42:22 nimbus [INFO] Reassigning top-6-1368970817 to 26 slots
2013-05-19 13:42:22 nimbus [INFO] Reassign ids: [7 10 11 12 13 14 16 18 19 
20 22 23 24 25 26 28 30 37 40 41 42 43 44 46 48 49 50 52 53 54 55 56 58 60 
67 70 71
 72 73 74 76 78 79 80 82 83 84 85 86 88 90 97]
2013-05-19 13:42:22 nimbus [INFO] Available slots: ()
2013-05-19 13:42:22 nimbus [INFO] Setting new assignment for storm id 
top-6-1368970817:

Is the worker process taking too long to start up? I see each task taking 
about 5-6 seconds. All the tasks are multilang processes.
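
To see at a glance which tasks Nimbus gave up on, the "timed out" lines above can be grouped per topology. A minimal Python sketch, assuming the nimbus log format shown above; timed_out_tasks is a hypothetical helper, not part of Storm:

```python
import re

# Matches lines such as:
# 2013-05-19 13:42:22 nimbus [INFO] Task top-6-1368970817:37 timed out
TIMED_OUT = re.compile(r"nimbus \[INFO\] Task (\S+):(\d+) timed out")

def timed_out_tasks(lines):
    """Map each topology id to the sorted list of its timed-out task ids."""
    result = {}
    for line in lines:
        match = TIMED_OUT.search(line)
        if match is None:
            continue
        result.setdefault(match.group(1), []).append(int(match.group(2)))
    return {topology: sorted(tasks) for topology, tasks in result.items()}
```

For the five "timed out" lines above this groups tasks 7, 13, 19, 23 and 37 under top-6-1368970817.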

&lt;/pre&gt;</description>
    <dc:creator>Vinod</dc:creator>
    <dc:date>2013-05-20T22:58:31</dc:date>
  </item>
  <item rdf:about="http://comments.gmane.org/gmane.comp.java.clojure.storm/9653">
    <title>Topology conf is not json-serializable</title>
    <link>http://comments.gmane.org/gmane.comp.java.clojure.storm/9653</link>
    <description>&lt;pre&gt;Hi all,

I got a "java.lang.IllegalArgumentException: Topology conf is not 
json-serializable" error when I submitted my topology.

Here's my topology definition in main class:

TopologyBuilder builder = new TopologyBuilder();
builder.setSpout("file-reader", new File_emmitter());
builder.setBolt("processor", new Predict_stream()).shuffleGrouping("file-reader");

Config conf = new Config();
conf.put("wordsFile", args[0]); // input stream file
conf.put("knowledge", args[1]); // file on some network file share, must allow guest access
conf.setDebug(false);

// Topology run
conf.put(Config.TOPOLOGY_MAX_SPOUT_PENDING, 1);
LocalCluster cluster = new LocalCluster();
// Error raised here as per message
cluster.submitTopology("Stream Prediction", conf, builder.createTopology());

Help appreciated.

Cheers,
Sanjai

&lt;/pre&gt;</description>
    <dc:creator>Sanjai Nair</dc:creator>
    <dc:date>2013-05-20T22:14:52</dc:date>
  </item>
  <item rdf:about="http://comments.gmane.org/gmane.comp.java.clojure.storm/9651">
    <title>transaction topology throws:  org.apache.zookeeper.KeeperException$NoNodeException:</title>
    <link>http://comments.gmane.org/gmane.comp.java.clojure.storm/9651</link>
    <description>&lt;pre&gt;Hi all,

I keep running into this problem. I googled it and found 
https://groups.google.com/forum/#!msg/storm-user/Tn43K1eGcKY/Gag286Jg3H8J , 
where @nathanmarz said it may be caused by using the same id for 
the spout in multiple topologies. But I have only one topology running, and 
the txids are generated as prefix + date, like 
"kafkaspout20130520145047/coordinator". I am confused. 

I use Kafka 0.7.2, Storm 0.8.2 and ZooKeeper 3.4.3. 
Here is how I construct the builder:
TransactionalTopologyBuilder builder = new TransactionalTopologyBuilder("log" + uid, "kafkaspout" + uid, new OpaqueTransactionalKafkaSpout(kafkaConf), 2);

Here are the details:

Coordinator exception:
                            java.lang.RuntimeException: 
java.lang.RuntimeException: 
org.apache.zookeeper.KeeperException$NoNodeException: KeeperErrorCode = 
NoNode for /meta/1470191 at backtype.storm.utils.DisruptorQueue.consume

real_spout exception:
Mon, 20 May 2013 07:16:36 +0000

java.lang.RuntimeException: java.lang.RuntimeException: org.apache.zookeeper.KeeperException$NoNodeException: KeeperErrorCode = NoNode for /1/19572
at backtype.storm.utils.DisruptorQueue.consumeBatchToCursor(DisruptorQueue.java:87)
at backtype.storm.utils.DisruptorQueue.consumeBatchWhenAvailable(DisruptorQueue.java:58)
at backtype.storm.disruptor$consume_batch_when_available.invoke(disruptor.clj:62)
at backtype.storm.daemon.executor$fn__4050$fn__4059$fn__4106.invoke(executor.clj:658)
at backtype.storm.util$async_loop$fn__465.invoke(util.clj:377)
at clojure.lang.AFn.run(AFn.java:24)
at java.lang.Thread.run(Thread.java:662)
Caused by: java.lang.RuntimeException: org.apache.zookeeper.KeeperException$NoNodeException: KeeperErrorCode = NoNode for /1/19572
at backtype.storm.transactional.state.TransactionalState.delete(TransactionalState.java:79)
at backtype.storm.transactional.state.RotatingTransactionalState.cleanupBefore(RotatingTransactionalState.java:112)
at backtype.storm.transactional.partitioned.OpaquePartitionedTransactionalSpoutExecutor$Emitter.cleanupBefore(OpaquePartitionedTransactionalSpoutExecutor.java:98)
at backtype.storm.transactional.TransactionalSpoutBatchExecutor.execute(TransactionalSpoutBatchExecutor.java:55)
at backtype.storm.coordination.CoordinatedBolt.execute(CoordinatedBolt.java:307)
at backtype.storm.daemon.executor$fn__4050$tuple_action_fn__4052.invoke(executor.clj:566)
at backtype.storm.daemon.executor$mk_task_receiver$fn__3976.invoke(executor.clj:345)
at backtype.storm.disruptor$clojure_handler$reify__1606.onEvent(disruptor.clj:43)
at backtype.storm.utils.DisruptorQueue.consumeBatchToCursor(DisruptorQueue.java:84)
... 6 more
Caused by: org.apache.zookeeper.KeeperException$NoNodeException: KeeperErrorCode = NoNode for /1/19572
at org.apache.zookeeper.KeeperException.create(KeeperException.java:102)
at org.apache.zookeeper.KeeperException.create(KeeperException.java:42)
at org.apache.zookeeper.ZooKeeper.delete(ZooKeeper.java:728)
at com.netflix.curator.framework.imps.DeleteBuilderImpl$2.call(DeleteBuilderImpl.java:129)
at com.netflix.curator.framework.imps.DeleteBuilderImpl$2.call(DeleteBuilderImpl.java:125)
at com.netflix.curator.RetryLoop.callWithRetry(RetryLoop.java:85)
at com.netflix.curator.framework.imps.DeleteBuilderImpl.pathInForeground(DeleteBuilderImpl.java:121)
at com.netflix.curator.framework.imps.DeleteBuilderImpl.forPath(DeleteBuilderImpl.java:113)
at com.netflix.curator.framework.imps.DeleteBuilderImpl.forPath(DeleteBuilderImpl.java:32)
at backtype.storm.transactional.state.TransactionalState.delete(TransactionalState.java:77)
... 14 more

&lt;/pre&gt;</description>
    <dc:creator>art0chu-Re5JQEeQqe8AvxtiuMwx3w&lt; at &gt;public.gmane.org</dc:creator>
    <dc:date>2013-05-20T07:34:46</dc:date>
  </item>
  <item rdf:about="http://comments.gmane.org/gmane.comp.java.clojure.storm/9649">
    <title>Nimbus tasks keep timing out with higher #task / #worker ratio</title>
    <link>http://comments.gmane.org/gmane.comp.java.clojure.storm/9649</link>
    <description>&lt;pre&gt;I'm running storm on 2 large AWS instances at the moment, and it takes 
about a minute to load all the tasks (100 multilang tasks spread across 30 
workers). I'm hitting issues where when I increase the number of tasks 
beyond around 100, zookeeper logs show that it thinks sessions are timing 
out. nimbus also shows a subset of the tasks having timed out.

I increased the timeouts to 40 seconds hoping that it would work as a 
temporary solution. Here are the relevant sections of the zookeeper log:

2013-05-19 13:41:18,062 - INFO  [SyncThread:0:NIOServerCnxn@1580] - Established session 0x13e9ddb71a50308 with negotiated timeout 40000 for client /10.200.200.200:48274
[...]
2013-05-19 13:43:31,347 - WARN  [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn@634] - EndOfStreamException: Unable to read additional data from client sessionid 0x13e9ddb71a50308, likely client has closed socket
2013-05-19 13:43:31,348 - INFO  [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn@1435] - Closed socket connection for client /10.200.200.200:48274 which had sessionid 0x13e9ddb71a50308
[...]
2013-05-19 13:44:12,000 - INFO  [SessionTracker:ZooKeeperServer@316] - Expiring session 0x13e9ddb71a50308, timeout of 40000ms exceeded
2013-05-19 13:44:12,000 - INFO  [ProcessThread:-1:PrepRequestProcessor@399] - Processed session termination for sessionid: 0x13e9ddb71a50308

nimbus logs around this time:

2013-05-19 13:40:17 nimbus [INFO] Activating file: top-6-1368970817
2013-05-19 13:42:22 nimbus [INFO] Task top-6-1368970817:37 timed out
2013-05-19 13:42:22 nimbus [INFO] Task top-6-1368970817:7 timed out
2013-05-19 13:42:22 nimbus [INFO] Task top-6-1368970817:13 timed out
2013-05-19 13:42:22 nimbus [INFO] Task top-6-1368970817:19 timed out
2013-05-19 13:42:22 nimbus [INFO] Task top-6-1368970817:23 timed out
2013-05-19 13:42:22 nimbus [INFO] Reassigning top-6-1368970817 to 26 slots
2013-05-19 13:42:22 nimbus [INFO] Reassign ids: [7 10 11 12 13 14 16 18 19 
20 22 23 24 25 26 28 30 37 40 41 42 43 44 46 48 49 50 52 53 54 55 56 58 60 
67 70 71
 72 73 74 76 78 79 80 82 83 84 85 86 88 90 97]
2013-05-19 13:42:22 nimbus [INFO] Available slots: ()
2013-05-19 13:42:22 nimbus [INFO] Setting new assignment for storm id 
top-6-1368970817:

This causes storm workers to repeatedly restart till I reduce the 
parallelism enough for timeouts not to occur. Checking tasks, I see that 
each one takes 4-7 seconds to finish loading but can sometimes take much 
longer (they are all python/ruby multilang processes). Would this be the 
cause of the timeouts? Below is a partial log of a worker successfully 
starting up, allowing you to see the timestamps:

2013-05-18 18:13:16 ZooKeeper [INFO] Initiating client connection, connectString=zookeeper.domain.com:2181/storm sessionTimeout=120000 watcher=com.netflix.curator.ConnectionState@11fb24d3
2013-05-18 18:13:16 ClientCnxn [INFO] Opening socket connection to server 
zookeeper.domain.com/10.200.200.201:2181
2013-05-18 18:13:16 ClientCnxn [INFO] Socket connection established to 
zookeeper.domain.com/10.200.200.201:2181, initiating session
2013-05-18 18:13:16 ClientCnxn [INFO] Session establishment complete on 
server zookeeper.domain.com/10.200.200.201:2181, sessionid = 
0x13e9ddb71a502ed, negotiated timeout = 40000
2013-05-18 18:13:17 task [INFO] Loading task first_bolt:64
2013-05-18 18:13:17 task [INFO] Preparing bolt first_bolt:64
2013-05-18 18:13:21 ShellBolt [INFO] Launched subprocess with pid 29132
2013-05-18 18:13:21 task [INFO] Prepared bolt first_bolt:64
2013-05-18 18:13:21 task [INFO] Finished loading task first_bolt:64
2013-05-18 18:13:21 task [INFO] Loading task second_bolt:1
2013-05-18 18:13:21 task [INFO] Preparing bolt second_bolt:1
2013-05-18 18:13:22 ShellBolt [INFO] Launched subprocess with pid 29301
2013-05-18 18:13:22 task [INFO] Prepared bolt second_bolt:1
2013-05-18 18:13:22 task [INFO] Finished loading task second_bolt:1
2013-05-18 18:13:22 task [INFO] Loading task third_bolt:33
2013-05-18 18:13:22 task [INFO] Preparing bolt third_bolt:33
2013-05-18 18:13:23 ShellBolt [INFO] Launched subprocess with pid 29459
2013-05-18 18:13:23 task [INFO] Prepared bolt third_bolt:33
2013-05-18 18:13:23 task [INFO] Finished loading task third_bolt:33
2013-05-18 18:13:23 task [INFO] Loading task first_spout:94
2013-05-18 18:13:24 task [INFO] Opening spout first_spout:94
2013-05-18 18:13:28 ShellSpout [INFO] Launched subprocess with pid 29619
2013-05-18 18:13:28 task [INFO] Opened spout first_spout:94
2013-05-18 18:13:28 task [INFO] Activating spout first_spout:94
2013-05-18 18:13:28 task [INFO] Finished loading task first_spout:94
2013-05-18 18:13:28 worker [INFO] Launching receive-thread for 
336cb3ef-802a-4f8e-9fac-7558867230f0:6712
2013-05-18 18:13:28 worker [INFO] Worker has topology config {...}
2013-05-18 18:13:28 worker [INFO] Worker 
9c588d44-2b99-4a79-a265-2c510de6e192 for storm top-6-1368970817 on 
336cb3ef-802a-4f8e-9fac-7558867230f0:6712 has finished loading

Any help avoiding timeouts while scaling up the number of tasks would be 
appreciated. Running storm locally in one process starts up the tasks much 
faster, for what it's worth.
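
The per-task load time can be read off the worker log by pairing each "Loading task" line with its "Finished loading task" line. A minimal Python sketch, assuming the worker log format shown above; task_load_seconds is a hypothetical helper, not part of Storm:

```python
import re
from datetime import datetime

# Matches the "Loading task X" and "Finished loading task X" lines of a worker log.
LINE = re.compile(
    r"^(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}) task \[INFO\] "
    r"(Loading|Finished loading) task (\S+)$"
)

def task_load_seconds(lines):
    """Return {task name: seconds between its Loading and Finished loading lines}."""
    started = {}
    durations = {}
    for line in lines:
        match = LINE.match(line.strip())
        if match is None:
            continue
        stamp = datetime.strptime(match.group(1), "%Y-%m-%d %H:%M:%S")
        event, task = match.group(2), match.group(3)
        if event == "Loading":
            started[task] = stamp
        elif task in started:
            durations[task] = (stamp - started.pop(task)).total_seconds()
    return durations
```

Applied to the excerpt above it gives 4, 1, 1 and 5 seconds for first_bolt:64, second_bolt:1, third_bolt:33 and first_spout:94. Summing these per worker gives a rough startup time to compare against the configured timeouts.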

Thanks!

&lt;/pre&gt;</description>
    <dc:creator>Vinod</dc:creator>
    <dc:date>2013-05-20T02:43:27</dc:date>
  </item>
  <item rdf:about="http://comments.gmane.org/gmane.comp.java.clojure.storm/9641">
    <title>storm-kafka : weird behavior with offsets</title>
    <link>http://comments.gmane.org/gmane.comp.java.clojure.storm/9641</link>
    <description>&lt;pre&gt;Hi,

I am running a test topology in cluster mode with kafka as my queue and
processing a fairly complex topology using the OpaqueTridentKafkaSpout. It
takes anywhere between 2-3 seconds for a batch to commit and the
MAX_SPOUT_PENDING is set to 4.

My queue consists of data over multiple days (in fact months). I checked
that one of the days finished fine and the processing had moved on to the
next day. After a few minutes, instead of moving to the next day it moved
back to the previously processed day. I have never observed this strange
behavior and I ran a very simple kafka consumer to make sure none of my
partitions contain interlaced data.

Has anyone seen this kind of behavior?

*Environment*
storm: 0.9.0.wip16
kafka: 0.7.2
storm-kafka: 0.9.0.wip15

Thanks,
Viral

&lt;/pre&gt;</description>
    <dc:creator>Viral Bajaria</dc:creator>
    <dc:date>2013-05-20T10:14:29</dc:date>
  </item>
  <item rdf:about="http://comments.gmane.org/gmane.comp.java.clojure.storm/9638">
    <title>ERROR when submitting 0.9 wip 16 in cluster :submit topologies using the 'storm' client script so that StormSubmitter knows which jar to upload.</title>
    <link>http://comments.gmane.org/gmane.comp.java.clojure.storm/9638</link>
    <description>&lt;pre&gt;Hi,

I recently upgraded to storm 0.9 wip 16 to run a storm kafka project.

My code works fine with storm on a local cluster. But when I compile and run it using

mvn -f m2-pom.xml compile exec:java -Dexec.classpathScope=compile -Dexec.mainClass=storm.starter.MainTopology -Dexec.args="test"

I get the following error.... 
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
565  [storm.starter.MainTopology.main()] INFO  backtype.storm.StormSubmitter 
- Jar not uploaded to master yet. Submitting jar...
[WARNING] 
java.lang.reflect.InvocationTargetException
       at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
       at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
       at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
       at java.lang.reflect.Method.invoke(Method.java:601)
       at org.codehaus.mojo.exec.ExecJavaMojo$1.run(ExecJavaMojo.java:297)
       at java.lang.Thread.run(Thread.java:722)
Caused by: java.lang.RuntimeException: Must submit topologies using the 'storm' client script so that StormSubmitter knows which jar to upload.
       at backtype.storm.StormSubmitter.submitJar(StormSubmitter.java:131)
       at backtype.storm.StormSubmitter.submitJar(StormSubmitter.java:123)
       at backtype.storm.StormSubmitter.submitTopology(StormSubmitter.java:74)
       at backtype.storm.StormSubmitter.submitTopology(StormSubmitter.java:41)
       at storm.starter.MainTopology.main(MainTopology.java:54)
       ... 6 more
[INFO] 
------------------------------------------------------------------------
[ERROR] BUILD ERROR
[INFO] 
------------------------------------------------------------------------
[INFO] An exception occured while executing the Java class. null

Must submit topologies using the 'storm' client script so that 
StormSubmitter knows which jar to upload.


-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------

Please help.

Regards

Chitra

&lt;/pre&gt;</description>
    <dc:creator>ChitraR</dc:creator>
    <dc:date>2013-05-20T09:16:46</dc:date>
  </item>
  <item rdf:about="http://comments.gmane.org/gmane.comp.java.clojure.storm/9631">
    <title>Upgrade Storm HttpClient Version</title>
    <link>http://comments.gmane.org/gmane.comp.java.clojure.storm/9631</link>
    <description>&lt;pre&gt;I am using a library that requires a later version of httpclient than what 
is shipped with Storm (specifically 4.3).  What is the most appropriate way 
to move to this new version?  Is anyone aware of conflicts if we modify 
Storm to use it?  Are there other ways to do this without 
modifying Storm's default libraries?  Any help is appreciated.

&lt;/pre&gt;</description>
    <dc:creator>Jamie Johnson</dc:creator>
    <dc:date>2013-05-17T21:39:48</dc:date>
  </item>
  <item rdf:about="http://comments.gmane.org/gmane.comp.java.clojure.storm/9629">
    <title>EvenScheduler don`t work after my custom schedule</title>
    <link>http://comments.gmane.org/gmane.comp.java.clojure.storm/9629</link>
    <description>&lt;pre&gt;After assigning 2 special executors to 2 slots of a special supervisor,
new EvenScheduler().schedule(topologies, cluster) doesn't assign the 2 
non-special executors and the "__acker" to slots.

By the way, there are some free slots.

&lt;/pre&gt;</description>
    <dc:creator>杨昆</dc:creator>
    <dc:date>2013-05-17T14:18:07</dc:date>
  </item>
  <item rdf:about="http://comments.gmane.org/gmane.comp.java.clojure.storm/9620">
    <title>Nimbus not starting up</title>
    <link>http://comments.gmane.org/gmane.comp.java.clojure.storm/9620</link>
    <description>&lt;pre&gt;Hi,

I have Zookeeper running on 2181 (ports are opened):

*hduser@hduser-laptop:~/.storm$* netstat -a | grep -e "2181"
tcp        0      0 *:2181                  *:*                     LISTEN 
    

But when I tried to start Nimbus, I got the exception below in 
the nimbus log file:

2013-05-17 19:17:12 ZooKeeper [INFO] Initiating client connection, 
connectString=localhost:2181 sessionTimeout=20000 
watcher=com.netflix.curator.ConnectionState@53e6978d
2013-05-17 19:17:12 ClientCnxn [INFO] Opening socket connection to server 
localhost/127.0.0.1:2181
2013-05-17 19:17:12 ClientCnxn [INFO] Socket connection established to 
localhost/127.0.0.1:2181, initiating session
2013-05-17 19:17:13 ClientCnxn [INFO] Session establishment complete on 
server localhost/127.0.0.1:2181, sessionid = 0x13eb48fd6420000, negotiated 
timeout = 20000
2013-05-17 19:17:13 zookeeper [INFO] Zookeeper state update: :connected:none
2013-05-17 19:17:13 ZooKeeper [INFO] Session: 0x13eb48fd6420000 closed
2013-05-17 19:17:13 CuratorFrameworkImpl [INFO] Starting
2013-05-17 19:17:13 ZooKeeper [INFO] Initiating client connection, 
connectString=localhost:2181/storm sessionTimeout=20000 
watcher=com.netflix.curator.ConnectionState@11c0b8a0
2013-05-17 19:17:13 ClientCnxn [INFO] Opening socket connection to server 
localhost/127.0.0.1:2181
2013-05-17 19:17:13 ClientCnxn [INFO] EventThread shut down
2013-05-17 19:17:13 ClientCnxn [INFO] Socket connection established to 
localhost/127.0.0.1:2181, initiating session
2013-05-17 19:17:13 ClientCnxn [INFO] Session establishment complete on 
server localhost/127.0.0.1:2181, sessionid = 0x13eb48fd6420001, negotiated 
timeout = 20000
2013-05-17 19:17:13 nimbus [INFO] Starting Nimbus server...
2013-05-17 19:22:04 TNonblockingServer [WARN] Got an IOException in 
internalRead!
java.io.IOException: Connection reset by peer
at sun.nio.ch.FileDispatcher.read0(Native Method)
at sun.nio.ch.SocketDispatcher.read(SocketDispatcher.java:21)
at sun.nio.ch.IOUtil.readIntoNativeBuffer(IOUtil.java:202)
at sun.nio.ch.IOUtil.read(IOUtil.java:175)
at sun.nio.ch.SocketChannelImpl.read(SocketChannelImpl.java:243)
at org.apache.thrift7.transport.TNonblockingSocket.read(TNonblockingSocket.java:141)
at org.apache.thrift7.server.TNonblockingServer$FrameBuffer.internalRead(TNonblockingServer.java:669)
at org.apache.thrift7.server.TNonblockingServer$FrameBuffer.read(TNonblockingServer.java:458)
at org.apache.thrift7.server.TNonblockingServer$SelectThread.handleRead(TNonblockingServer.java:359)
at org.apache.thrift7.server.TNonblockingServer$SelectThread.select(TNonblockingServer.java:304)
at org.apache.thrift7.server.TNonblockingServer$SelectThread.run(TNonblockingServer.java:243)

I have ensured that port 6627 is not blocked either. I have placed the 
storm.yaml file inside the storm/conf folder as well as inside the .storm 
folder in my home directory, with all permissions.

Help appreciated.


Cheers

&lt;/pre&gt;</description>
    <dc:creator>Sanjai Nair</dc:creator>
    <dc:date>2013-05-17T18:55:05</dc:date>
  </item>
  <item rdf:about="http://comments.gmane.org/gmane.comp.java.clojure.storm/9618">
    <title>Initializing a tuple with data based on its fields grouping</title>
    <link>http://comments.gmane.org/gmane.comp.java.clojure.storm/9618</link>
    <description>&lt;pre&gt;I have a bolt that counts incoming tuples grouped by field and periodically 
(via a tick tuple) emits those counts. I need this bolt to emit counts for 
all of the possible values that it might receive via its fields grouping 
which also are in our database. So the bolt must be initialized with 
counters for all of the tuples it might receive. The trick is knowing which 
tuples each bolt might receive since it uses a fields grouping. One 
solution is to have another bolt upstream that sends these initialization 
tuples to my bolt. But if my bolt is restarted then these counters will be 
lost. Are there any other solutions for this?
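
For illustration only, the counting behavior described above can be sketched in plain Python. This is not the Storm API: the names CountingBolt, on_tuple, on_tick and known_keys are hypothetical, and resetting the counts after each emit is an assumption that the description does not state.

```python
from collections import Counter


class CountingBolt:
    """Plain-Python stand-in for the counting bolt (hypothetical, not Storm)."""

    def __init__(self, known_keys=()):
        # Pre-seed a zero counter for every key expected from the database.
        self.counts = Counter({key: 0 for key in known_keys})

    def on_tuple(self, key):
        # Called once per incoming tuple; key is the grouping field value.
        self.counts[key] += 1

    def on_tick(self):
        # Called on a tick tuple: snapshot all counts, then reset them to zero
        # (the reset is an assumption made for this sketch).
        snapshot = dict(self.counts)
        for key in self.counts:
            self.counts[key] = 0
        return snapshot


bolt = CountingBolt(known_keys=["a", "b"])
bolt.on_tuple("a")
bolt.on_tuple("a")
print(bolt.on_tick())
```

Because the counters live only in memory, a restarted instance starts again from whatever known_keys it is given, which is the loss the question is about.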

Thanks,
Josh

&lt;/pre&gt;</description>
    <dc:creator>pacesysjosh-Re5JQEeQqe8AvxtiuMwx3w&lt; at &gt;public.gmane.org</dc:creator>
    <dc:date>2013-05-17T16:58:49</dc:date>
  </item>
  <item rdf:about="http://comments.gmane.org/gmane.comp.java.clojure.storm/9617">
    <title>buffer size question</title>
    <link>http://comments.gmane.org/gmane.comp.java.clojure.storm/9617</link>
    <description>&lt;pre&gt;Hi 

I was wondering how Storm behaves with respect to 
topology.executor.receive.buffer.size: 8 #batched
and
topology.executor.send.buffer.size: 8 #individual messages

Will the executor wait until it has 8 messages before it sends a batched 
message?
If yes, what happens if there are only 2 messages for a long time? Will the 
tuples just time out, or be sent after a preconfigured time?

If I have a few tuples that have a long processing time in each bolt, would 
it be better to set the buffer sizes to 1?


TIA

&lt;/pre&gt;</description>
    <dc:creator>anahap</dc:creator>
    <dc:date>2013-05-17T14:14:14</dc:date>
  </item>
  <item rdf:about="http://comments.gmane.org/gmane.comp.java.clojure.storm/9602">
    <title>What is the impact of TridentState on Scalability?</title>
    <link>http://comments.gmane.org/gmane.comp.java.clojure.storm/9602</link>
    <description>&lt;pre&gt;
Can someone check me here?

It is my understanding that Trident batches up the tuples.  Then, batches
are sequenced.  

From the wiki:
"State updates are ordered among batches. That is, the state updates for
batch 3 won't be applied until the state updates for batch 2 have
succeeded."

Does this mean that only one TridentState instance is active in the topology
at a time? 
(i.e., effectively, wouldn't that mean that calls to commit() are
single-threaded?)

Won't that mean that throughput of the topology is limited to the write
throughput of a single host/instance?
(I can write the batch contents in parallel within a single node, but that
seems to defeat horizontal scalability)

What am I missing?

-brian

---
Brian O'Neill
Lead Architect, Software Development
Health Market Science
The Science of Better Results
2700 Horizon Drive • King of Prussia, PA • 19406
M: 215.588.6024 &amp;lt; at &amp;gt;boneill42 &amp;lt;http://www.twitter.com/boneill42&amp;gt;   •
healthmarketscience.com


 


&lt;/pre&gt;</description>
    <dc:creator>Brian O'Neill</dc:creator>
    <dc:date>2013-05-16T20:50:43</dc:date>
  </item>
  <item rdf:about="http://comments.gmane.org/gmane.comp.java.clojure.storm/9596">
    <title>Multiple bolts listening to single bolt; some messages sent twice to one and 0 times to other</title>
    <link>http://comments.gmane.org/gmane.comp.java.clojure.storm/9596</link>
    <description>&lt;pre&gt;I have a topology where there are two bolts ('a' &amp;amp; 'b') listening to a 
'splitter' bolt. The splitter bolt receives a large hunk of data and emits 
subsets of the data.

According to the Storm UI, the 'splitter' bolt is emitting 360 tuples. 
However, the UI also shows that bolt 'a' emitted 340 tuples and bolt 'b' 
emitted 380 tuples as a result of doing their work.

Which leads me to the conclusion that somehow storm delivered 20 of the 
tuples to bolt 'b' twice and that bolt 'a' did not receive them at all.

I think, to handle this, I could move the splitting into the spout and then 
ack the tuples as they get to places downstream. It doesn't look like there 
is any way to track them otherwise (disclaimer: I have not looked at 
trident). 

Is there an easier way of making sure that a subscriber gets all tuples 
emitted?




&lt;/pre&gt;</description>
    <dc:creator>bill robertson</dc:creator>
    <dc:date>2013-05-16T14:48:25</dc:date>
  </item>
  <item rdf:about="http://comments.gmane.org/gmane.comp.java.clojure.storm/9594">
    <title>Is there a way to debug Storm Clojure code?</title>
    <link>http://comments.gmane.org/gmane.comp.java.clojure.storm/9594</link>
    <description>&lt;pre&gt;Hi all
 
In other programming languages, there are many debugging tools, such as gdb.
 
Is there a good way to debug code in Storm in cluster mode? How can I make 
Storm output a dump file when it fails?
 
 
Thanks
-Tong

&lt;/pre&gt;</description>
    <dc:creator>Tong Jin</dc:creator>
    <dc:date>2013-05-16T09:22:08</dc:date>
  </item>
  <item rdf:about="http://comments.gmane.org/gmane.comp.java.clojure.storm/9593">
    <title>How to push data from an external port to Storm</title>
    <link>http://comments.gmane.org/gmane.comp.java.clojure.storm/9593</link>
    <description>&lt;pre&gt;Hi Guys,

An IP port is sending data. I want to read the data via Storm and store it 
in a NoSQL database (Cassandra).

How do I listen for data from an external port? Is there any function 
available in Storm?

https://github.com/nathanmarz/storm-contrib

There is no program for this available at the above link.




&lt;/pre&gt;</description>
    <dc:creator>kanna dhasan</dc:creator>
    <dc:date>2013-05-16T06:13:44</dc:date>
  </item>
  <item rdf:about="http://comments.gmane.org/gmane.comp.java.clojure.storm/9590">
    <title>How to build Petrel when PyYAML.org is down?</title>
    <link>http://comments.gmane.org/gmane.comp.java.clojure.storm/9590</link>
    <description>&lt;pre&gt;I'm trying to build petrel by following along with the instructions on the 
wiki, but it's trying to get a dependency from PyYAML.org and that site is 
down. I checked Twitter and it seems that this site goes down on occasion. 
Is there an alternate way to figure out what the dependency is and install 
it before I try to install Petrel?

Thanks in advance!

&lt;/pre&gt;</description>
    <dc:creator>John Hofmann</dc:creator>
    <dc:date>2013-05-16T01:00:08</dc:date>
  </item>
  <item rdf:about="http://comments.gmane.org/gmane.comp.java.clojure.storm/9587">
    <title>Throwing FailedException halts the processing on stream</title>
    <link>http://comments.gmane.org/gmane.comp.java.clojure.storm/9587</link>
    <description>&lt;pre&gt;I am using trident topology, it is using a  

FixedBatchSpout and the stream is persisted using partitionPersist. I want 
to test the replay scenario, to do this the stateUpdater is throwing a 
 failedexception and i expect the batch to be retried but the processing 
for the stream is halted. Any ideas?


TIA,

Sunil. 

&lt;/pre&gt;</description>
    <dc:creator>Sunil Yarram</dc:creator>
    <dc:date>2013-05-15T21:46:39</dc:date>
  </item>
  <textinput rdf:about="http://search.gmane.org/?group=$group=gmane.comp.java.clojure.storm">
    <title>Search Engine</title>
    <description>Search the mailing list at Gmane</description>
    <name>query</name>
    <link>http://search.gmane.org/?group=$group=gmane.comp.java.clojure.storm</link>
  </textinput>
</rdf:RDF>
