<?xml version="1.0" encoding="UTF-8"?>
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns="http://purl.org/rss/1.0/" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:syn="http://purl.org/rss/1.0/modules/syndication/" xmlns:admin="http://webns.net/mvcb/">
  <channel rdf:about="http://permalink.gmane.org/gmane.comp.java.hadoop.pig.devel">
    <title>gmane.comp.java.hadoop.pig.devel</title>
    <link>http://permalink.gmane.org/gmane.comp.java.hadoop.pig.devel</link>
    <description/>
    <syn:updatePeriod>hourly</syn:updatePeriod>
    <syn:updateFrequency>1</syn:updateFrequency>
    <syn:updateBase>1901-01-01T00:00+00:00</syn:updateBase>
    <items>
      <rdf:Seq>
        <rdf:li rdf:resource="http://permalink.gmane.org/gmane.comp.java.hadoop.pig.devel/31547"/>
        <rdf:li rdf:resource="http://permalink.gmane.org/gmane.comp.java.hadoop.pig.devel/31546"/>
        <rdf:li rdf:resource="http://permalink.gmane.org/gmane.comp.java.hadoop.pig.devel/31545"/>
        <rdf:li rdf:resource="http://permalink.gmane.org/gmane.comp.java.hadoop.pig.devel/31544"/>
        <rdf:li rdf:resource="http://permalink.gmane.org/gmane.comp.java.hadoop.pig.devel/31543"/>
        <rdf:li rdf:resource="http://permalink.gmane.org/gmane.comp.java.hadoop.pig.devel/31542"/>
        <rdf:li rdf:resource="http://permalink.gmane.org/gmane.comp.java.hadoop.pig.devel/31541"/>
        <rdf:li rdf:resource="http://permalink.gmane.org/gmane.comp.java.hadoop.pig.devel/31540"/>
        <rdf:li rdf:resource="http://permalink.gmane.org/gmane.comp.java.hadoop.pig.devel/31539"/>
        <rdf:li rdf:resource="http://permalink.gmane.org/gmane.comp.java.hadoop.pig.devel/31538"/>
        <rdf:li rdf:resource="http://permalink.gmane.org/gmane.comp.java.hadoop.pig.devel/31537"/>
        <rdf:li rdf:resource="http://permalink.gmane.org/gmane.comp.java.hadoop.pig.devel/31536"/>
        <rdf:li rdf:resource="http://permalink.gmane.org/gmane.comp.java.hadoop.pig.devel/31535"/>
        <rdf:li rdf:resource="http://permalink.gmane.org/gmane.comp.java.hadoop.pig.devel/31534"/>
        <rdf:li rdf:resource="http://permalink.gmane.org/gmane.comp.java.hadoop.pig.devel/31533"/>
        <rdf:li rdf:resource="http://permalink.gmane.org/gmane.comp.java.hadoop.pig.devel/31532"/>
        <rdf:li rdf:resource="http://permalink.gmane.org/gmane.comp.java.hadoop.pig.devel/31531"/>
        <rdf:li rdf:resource="http://permalink.gmane.org/gmane.comp.java.hadoop.pig.devel/31530"/>
        <rdf:li rdf:resource="http://permalink.gmane.org/gmane.comp.java.hadoop.pig.devel/31529"/>
        <rdf:li rdf:resource="http://permalink.gmane.org/gmane.comp.java.hadoop.pig.devel/31528"/>
        <rdf:li rdf:resource="http://permalink.gmane.org/gmane.comp.java.hadoop.pig.devel/31527"/>
      </rdf:Seq>
    </items>
    <image rdf:resource="http://gmane.org/img/gmane-25t.png"/>
    <textinput rdf:resource="http://search.gmane.org/?group=$group=gmane.comp.java.hadoop.pig.devel"/>
  </channel>
  <image rdf:about="http://gmane.org/img/gmane-25t.png">
    <title>Gmane</title>
    <url>http://gmane.org/img/gmane-25t.png</url>
    <link>http://gmane.org</link>
  </image>
  <item rdf:about="http://permalink.gmane.org/gmane.comp.java.hadoop.pig.devel/31547">
    <title>[Commented] (PIG-3288) Kill jobs if the number of output files is over a configurable limit</title>
    <link>http://permalink.gmane.org/gmane.comp.java.hadoop.pig.devel/31547</link>
    <description>&lt;pre&gt;
    [ https://issues.apache.org/jira/browse/PIG-3288?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&amp;amp;focusedCommentId=13687624#comment-13687624 ] 

Aniket Mokashi commented on PIG-3288:
-------------------------------------

The implementation is generic: the counter need not count the number of files. It can count arbitrary metrics and kill the job when the limit is exceeded. Should we rename the counter from "pig.exec.created.files.max.limit" to something else?
Also, in the storefunc, you are relying on the storefunc being reinitialized in a new object for each new file. Is that guaranteed behavior?
                

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

&lt;/pre&gt;</description>
    <dc:creator>Aniket Mokashi (JIRA)</dc:creator>
    <dc:date>2013-06-19T05:25:21</dc:date>
  </item>
  <item rdf:about="http://permalink.gmane.org/gmane.comp.java.hadoop.pig.devel/31546">
    <title>[Commented] (PIG-3357) Pig doesn't take care of declared float type and converts it to double</title>
    <link>http://permalink.gmane.org/gmane.comp.java.hadoop.pig.devel/31546</link>
    <description>&lt;pre&gt;
    [ https://issues.apache.org/jira/browse/PIG-3357?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&amp;amp;focusedCommentId=13687481#comment-13687481 ] 

Cheolsoo Park commented on PIG-3357:
------------------------------------

This is because Pig always converts Python float to Java double.
{code:title=JythonUtils.java}
} else if (pyObject instanceof PyFloat) {
   // J(P)ython is loosely typed, supports only float type, 
   // hence we convert everything to double to save precision
   javaObj = pyObject.__tojava__(Double.class);
}
{code}
In fact, that is a good thing because Python floating point numbers are implemented using double in C:
http://docs.python.org/2/library/stdtypes.html#typesnumeric
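
A minimal sketch of the precision argument, assuming standard Python semantics; to_float32 is a hypothetical helper, not part of Pig or Jython. It round-trips a Python float through a 4-byte IEEE 754 encoding, which approximates narrowing a Java double to a Java float:

```python
import struct


def to_float32(value):
    # Pack as a 4-byte float and unpack again, discarding the extra
    # precision that a Python float (a C double) carries.
    return struct.unpack("f", struct.pack("f", value))[0]


x = 0.1
narrowed = to_float32(x)
print(repr(x), repr(narrowed))
print(x == narrowed)  # False: 0.1 is not exactly representable in 32 bits
```

Values such as 0.5 that are exactly representable in 32 bits survive the round trip unchanged; most others do not, which is why converting to double avoids silent precision loss.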

One workaround is to declare bytearray as the outputSchema of the Python UDF and cast the result to float in Pig. That is,
{code:title=test.py}
@outputSchema("v: bytearray")
def test(v):
   return v
{code}
{code:title=test.pig}
table_out = foreach table_in generate (float)udf.test(v);
{code}
Perhaps we sh&lt;/pre&gt;</description>
    <dc:creator>Cheolsoo Park (JIRA)</dc:creator>
    <dc:date>2013-06-19T01:14:20</dc:date>
  </item>
  <item rdf:about="http://permalink.gmane.org/gmane.comp.java.hadoop.pig.devel/31545">
    <title>Subscription: PIG patch available</title>
    <link>http://permalink.gmane.org/gmane.comp.java.hadoop.pig.devel/31545</link>
    <description>&lt;pre&gt;Issue Subscription
Filter: PIG patch available (15 issues)

Subscriber: pigdaily
        
Key         Summary
PIG-3346    New property that controls the number of combined splits
            https://issues.apache.org/jira/browse/PIG-3346
PIG-3333    Fix remaining Windows core unit test failures
            https://issues.apache.org/jira/browse/PIG-3333
PIG-3295    Casting from bytearray failing after Union (even when each field is from a single Loader)
            https://issues.apache.org/jira/browse/PIG-3295
PIG-3292    Logical plan invalid state: duplicate uid in schema during self-join to get cross product
            https://issues.apache.org/jira/browse/PIG-3292
PIG-3288    Kill jobs if the number of output files is over a configurable limit
            https://issues.apache.org/jira/browse/PIG-3288
PIG-3257    Add unique identifier UDF
            https://issues.apache.org/jira/browse/PIG-3257
PIG-3247    Piggybank functions to mimic OVER clause in SQL
            https://issues.apache.org/jira/browse&lt;/pre&gt;</description>
    <dc:creator>jira-1oDqGaOF3Lkdnm+yROfE0A&lt; at &gt;public.gmane.org</dc:creator>
    <dc:date>2013-06-19T01:01:02</dc:date>
  </item>
  <item rdf:about="http://permalink.gmane.org/gmane.comp.java.hadoop.pig.devel/31544">
    <title>Fwd: IMPORTANT: Major Confluence Upgrade Coming Soon. Please review test instance now.</title>
    <link>http://permalink.gmane.org/gmane.comp.java.hadoop.pig.devel/31544</link>
    <description>&lt;pre&gt;

Julien

Begin forwarded message:

&lt;/pre&gt;</description>
    <dc:creator>Julien Le Dem</dc:creator>
    <dc:date>2013-06-18T19:35:24</dc:date>
  </item>
  <item rdf:about="http://permalink.gmane.org/gmane.comp.java.hadoop.pig.devel/31543">
    <title>[Updated] (PIG-3355) ColumnMapKeyPrune bug with distinct operator</title>
    <link>http://permalink.gmane.org/gmane.comp.java.hadoop.pig.devel/31543</link>
    <description>&lt;pre&gt;
     [ https://issues.apache.org/jira/browse/PIG-3355?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Aniket Mokashi updated PIG-3355:
--------------------------------

    Resolution: Fixed
        Status: Resolved  (was: Patch Available)
    

&lt;/pre&gt;</description>
    <dc:creator>Aniket Mokashi (JIRA)</dc:creator>
    <dc:date>2013-06-18T17:38:21</dc:date>
  </item>
  <item rdf:about="http://permalink.gmane.org/gmane.comp.java.hadoop.pig.devel/31542">
    <title>[Commented] (PIG-3355) ColumnMapKeyPrune bug with distinct operator</title>
    <link>http://permalink.gmane.org/gmane.comp.java.hadoop.pig.devel/31542</link>
    <description>&lt;pre&gt;
    [ https://issues.apache.org/jira/browse/PIG-3355?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&amp;amp;focusedCommentId=13686997#comment-13686997 ] 

Aniket Mokashi commented on PIG-3355:
-------------------------------------

[~knoguchi], Feel free to rebase and submit a patch for 0.11.
                

&lt;/pre&gt;</description>
    <dc:creator>Aniket Mokashi (JIRA)</dc:creator>
    <dc:date>2013-06-18T17:38:21</dc:date>
  </item>
  <item rdf:about="http://permalink.gmane.org/gmane.comp.java.hadoop.pig.devel/31541">
    <title>[Commented] (PIG-3355) ColumnMapKeyPrune bug with distinct operator</title>
    <link>http://permalink.gmane.org/gmane.comp.java.hadoop.pig.devel/31541</link>
    <description>&lt;pre&gt;
    [ https://issues.apache.org/jira/browse/PIG-3355?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&amp;amp;focusedCommentId=13686815#comment-13686815 ] 

Koji Noguchi commented on PIG-3355:
-----------------------------------

bq. Committed to trunk. Thanks Jeremy!

[~aniket486], status is still "Patch Available"?  
Also, can we patch 0.11 as well so that it'll be included if we release another 0.11.* ?
                

&lt;/pre&gt;</description>
    <dc:creator>Koji Noguchi (JIRA)</dc:creator>
    <dc:date>2013-06-18T15:11:20</dc:date>
  </item>
  <item rdf:about="http://permalink.gmane.org/gmane.comp.java.hadoop.pig.devel/31540">
    <title>[Updated] (PIG-3357) Pig doesn't take care of declared float type and converts it to double</title>
    <link>http://permalink.gmane.org/gmane.comp.java.hadoop.pig.devel/31540</link>
    <description>&lt;pre&gt;
     [ https://issues.apache.org/jira/browse/PIG-3357?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sergey updated PIG-3357:
------------------------

    Summary: Pig doesn't take care of declared float type and converts it to double  (was: Pig doesn't take case of declared float and converts it to double)
    

&lt;/pre&gt;</description>
    <dc:creator>Sergey (JIRA)</dc:creator>
    <dc:date>2013-06-18T09:45:19</dc:date>
  </item>
  <item rdf:about="http://permalink.gmane.org/gmane.comp.java.hadoop.pig.devel/31539">
    <title>[Created] (PIG-3357) Pig doesn't take case of declared float and converts it to double</title>
    <link>http://permalink.gmane.org/gmane.comp.java.hadoop.pig.devel/31539</link>
    <description>&lt;pre&gt;Sergey created PIG-3357:
---------------------------

             Summary: Pig doesn't take case of declared float and converts it to double
                 Key: PIG-3357
                 URL: https://issues.apache.org/jira/browse/PIG-3357
             Project: Pig
          Issue Type: Bug
          Components: piggybank
    Affects Versions: 0.11
         Environment: cdh 4.3.0
            Reporter: Sergey


Here is the script:
{code}
register /usr/lib/pig/lib/avro-1.7.4.jar;
register /usr/lib/pig/lib/json-simple-1.1.jar;
register /usr/lib/pig/piggybank.jar;

register test.py using jython as udf;

table_in = load 'in' as (v: float);
table_out = foreach table_in generate udf.test(v);
store table_out into 'out' using org.apache.pig.piggybank.storage.avro.AvroStorage('schema', '{"name": "test", "type": "float"}');
{code}

Here is UDF:
{code=python}
@outputSchema("v: float")
def test(v):
  return v
{code}

Here is an input:
{code}
1
{code}

Here is the stacktrace:
java.lang.Exception: org.apache.avro.file.Da&lt;/pre&gt;</description>
    <dc:creator>Sergey (JIRA)</dc:creator>
    <dc:date>2013-06-18T09:43:21</dc:date>
  </item>
  <item rdf:about="http://permalink.gmane.org/gmane.comp.java.hadoop.pig.devel/31538">
    <title>[Updated] (PIG-3357) Pig doesn't take case of declared float and converts it to double</title>
    <link>http://permalink.gmane.org/gmane.comp.java.hadoop.pig.devel/31538</link>
    <description>&lt;pre&gt;
     [ https://issues.apache.org/jira/browse/PIG-3357?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sergey updated PIG-3357:
------------------------

    Description: 
Here is the script:
{code}
register /usr/lib/pig/lib/avro-1.7.4.jar;
register /usr/lib/pig/lib/json-simple-1.1.jar;
register /usr/lib/pig/piggybank.jar;

register test.py using jython as udf;

table_in = load 'in' as (v: float);
table_out = foreach table_in generate udf.test(v);
store table_out into 'out' using org.apache.pig.piggybank.storage.avro.AvroStorage('schema', '{"name": "test", "type": "float"}');
{code}

Here is UDF:
{code}
@outputSchema("v: float")
def test(v):
  return v
{code}

Here is an input:
{code}
1
{code}

Here is the stacktrace:
java.lang.Exception: org.apache.avro.file.DataFileWriter$AppendWriteException: java.io.IOException: Cannot convert to float:class java.lang.Double
        at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:404)
Caused by: org.apache.avro.file.DataFileWri&lt;/pre&gt;</description>
    <dc:creator>Sergey (JIRA)</dc:creator>
    <dc:date>2013-06-18T09:43:21</dc:date>
  </item>
  <item rdf:about="http://permalink.gmane.org/gmane.comp.java.hadoop.pig.devel/31537">
    <title>[Updated] (PIG-2244) Macros cannot be passed relation names</title>
    <link>http://permalink.gmane.org/gmane.comp.java.hadoop.pig.devel/31537</link>
    <description>&lt;pre&gt;
     [ https://issues.apache.org/jira/browse/PIG-2244?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Aniket Mokashi updated PIG-2244:
--------------------------------

    Status: Open  (was: Patch Available)
    

&lt;/pre&gt;</description>
    <dc:creator>Aniket Mokashi (JIRA)</dc:creator>
    <dc:date>2013-06-18T05:47:22</dc:date>
  </item>
  <item rdf:about="http://permalink.gmane.org/gmane.comp.java.hadoop.pig.devel/31536">
    <title>[Commented] (PIG-2244) Macros cannot be passed relation names</title>
    <link>http://permalink.gmane.org/gmane.comp.java.hadoop.pig.devel/31536</link>
    <description>&lt;pre&gt;
    [ https://issues.apache.org/jira/browse/PIG-2244?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&amp;amp;focusedCommentId=13686412#comment-13686412 ] 

Aniket Mokashi commented on PIG-2244:
-------------------------------------

Canceling patch. Please resubmit when it's ready.
                

&lt;/pre&gt;</description>
    <dc:creator>Aniket Mokashi (JIRA)</dc:creator>
    <dc:date>2013-06-18T05:47:21</dc:date>
  </item>
  <item rdf:about="http://permalink.gmane.org/gmane.comp.java.hadoop.pig.devel/31535">
    <title>[Commented] (PIG-3355) ColumnMapKeyPrune bug with distinct operator</title>
    <link>http://permalink.gmane.org/gmane.comp.java.hadoop.pig.devel/31535</link>
    <description>&lt;pre&gt;
    [ https://issues.apache.org/jira/browse/PIG-3355?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&amp;amp;focusedCommentId=13686411#comment-13686411 ] 

Aniket Mokashi commented on PIG-3355:
-------------------------------------

Committed to trunk. Thanks Jeremy!
                

&lt;/pre&gt;</description>
    <dc:creator>Aniket Mokashi (JIRA)</dc:creator>
    <dc:date>2013-06-18T05:45:24</dc:date>
  </item>
  <item rdf:about="http://permalink.gmane.org/gmane.comp.java.hadoop.pig.devel/31534">
    <title>[Assigned] (PIG-3355) ColumnMapKeyPrune bug with distinct operator</title>
    <link>http://permalink.gmane.org/gmane.comp.java.hadoop.pig.devel/31534</link>
    <description>&lt;pre&gt;
     [ https://issues.apache.org/jira/browse/PIG-3355?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Aniket Mokashi reassigned PIG-3355:
-----------------------------------

    Assignee: Jeremy Karn
    

&lt;/pre&gt;</description>
    <dc:creator>Aniket Mokashi (JIRA)</dc:creator>
    <dc:date>2013-06-18T04:50:23</dc:date>
  </item>
  <item rdf:about="http://permalink.gmane.org/gmane.comp.java.hadoop.pig.devel/31533">
    <title>[Commented] (PIG-3355) ColumnMapKeyPrune bug with distinct operator</title>
    <link>http://permalink.gmane.org/gmane.comp.java.hadoop.pig.devel/31533</link>
    <description>&lt;pre&gt;
    [ https://issues.apache.org/jira/browse/PIG-3355?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&amp;amp;focusedCommentId=13686389#comment-13686389 ] 

Aniket Mokashi commented on PIG-3355:
-------------------------------------

+1. Will commit if ant test-commit passes.
                

&lt;/pre&gt;</description>
    <dc:creator>Aniket Mokashi (JIRA)</dc:creator>
    <dc:date>2013-06-18T04:48:20</dc:date>
  </item>
  <item rdf:about="http://permalink.gmane.org/gmane.comp.java.hadoop.pig.devel/31532">
    <title>Subscription: PIG patch available</title>
    <link>http://permalink.gmane.org/gmane.comp.java.hadoop.pig.devel/31532</link>
    <description>&lt;pre&gt;Issue Subscription
Filter: PIG patch available (17 issues)

Subscriber: pigdaily
        
Key         Summary
PIG-3355    ColumnMapKeyPrune bug with distinct operator
            https://issues.apache.org/jira/browse/PIG-3355
PIG-3346    New property that controls the number of combined splits
            https://issues.apache.org/jira/browse/PIG-3346
PIG-3333    Fix remaining Windows core unit test failures
            https://issues.apache.org/jira/browse/PIG-3333
PIG-3295    Casting from bytearray failing after Union (even when each field is from a single Loader)
            https://issues.apache.org/jira/browse/PIG-3295
PIG-3292    Logical plan invalid state: duplicate uid in schema during self-join to get cross product
            https://issues.apache.org/jira/browse/PIG-3292
PIG-3288    Kill jobs if the number of output files is over a configurable limit
            https://issues.apache.org/jira/browse/PIG-3288
PIG-3257    Add unique identifier UDF
            https://issues.apache.org/jira/browse/PI&lt;/pre&gt;</description>
    <dc:creator>jira-1oDqGaOF3Lkdnm+yROfE0A&lt; at &gt;public.gmane.org</dc:creator>
    <dc:date>2013-06-18T01:02:01</dc:date>
  </item>
  <item rdf:about="http://permalink.gmane.org/gmane.comp.java.hadoop.pig.devel/31531">
    <title>[Commented] (PIG-2764) Add a biginteger and bigdecimal type to pig</title>
    <link>http://permalink.gmane.org/gmane.comp.java.hadoop.pig.devel/31531</link>
    <description>&lt;pre&gt;
    [ https://issues.apache.org/jira/browse/PIG-2764?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&amp;amp;focusedCommentId=13686249#comment-13686249 ] 

Rohini Palaniswamy commented on PIG-2764:
-----------------------------------------

bq. Actually I am going to backport this feature to pig-0.11.
  I am assuming you are doing this for an internal version of 0.11 that you maintain in git. Since this is a feature, it cannot be ported to the Apache Pig 0.11 branch. Only critical and major bugs can go into the branch.
                

&lt;/pre&gt;</description>
    <dc:creator>Rohini Palaniswamy (JIRA)</dc:creator>
    <dc:date>2013-06-18T00:49:21</dc:date>
  </item>
  <item rdf:about="http://permalink.gmane.org/gmane.comp.java.hadoop.pig.devel/31530">
    <title>[Commented] (PIG-3325) Adding a tuple to a bag is slow</title>
    <link>http://permalink.gmane.org/gmane.comp.java.hadoop.pig.devel/31530</link>
    <description>&lt;pre&gt;
    [ https://issues.apache.org/jira/browse/PIG-3325?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&amp;amp;focusedCommentId=13686142#comment-13686142 ] 

Mark Wagner commented on PIG-3325:
----------------------------------

Cool. Thanks, Rohini!

Dmitriy, what was your LoadFunc? I found that if the LoadFunc creates bags with an initial list of elements, then the slowness doesn't show up until later when new bags are created (during a DISTINCT in my case).
                

&lt;/pre&gt;</description>
    <dc:creator>Mark Wagner (JIRA)</dc:creator>
    <dc:date>2013-06-17T23:07:20</dc:date>
  </item>
  <item rdf:about="http://permalink.gmane.org/gmane.comp.java.hadoop.pig.devel/31529">
    <title>[Commented] (PIG-3325) Adding a tuple to a bag is slow</title>
    <link>http://permalink.gmane.org/gmane.comp.java.hadoop.pig.devel/31529</link>
    <description>&lt;pre&gt;
    [ https://issues.apache.org/jira/browse/PIG-3325?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&amp;amp;focusedCommentId=13686114#comment-13686114 ] 

Rohini Palaniswamy commented on PIG-3325:
-----------------------------------------

Mark,
  I already have a patch that does initialization of memory sizes only once and removes the markSpillableIfNecessary during addTuple. Will put it up by tomorrow after running some e2e tests for bag spilling. 

Dmitriy,
  Moving the getMemorySize out of the compare method should give a significant gain for the case that you were seeing. I will post some numbers after running some tests.
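
As a rough illustration of that gain (a sketch with a hypothetical Bag class, not Pig's memory manager code), sorting with a comparator that calls get_memory_size queries each bag repeatedly, while sorting on a precomputed key queries each bag once:

```python
from functools import cmp_to_key


class Bag:
    """Hypothetical stand-in for a spillable bag; counts size queries."""

    def __init__(self, n_tuples):
        self.n_tuples = n_tuples
        self.size_calls = 0

    def get_memory_size(self):
        # Pretend this is expensive; 8 bytes per tuple is an arbitrary figure.
        self.size_calls += 1
        return 8 * self.n_tuples


def compare_by_size(a, b):
    # The size is recomputed for both operands on every comparison.
    return a.get_memory_size() - b.get_memory_size()


def total_calls(bags):
    return sum(bag.size_calls for bag in bags)


sizes = [5, 3, 9, 1, 7]

slow = [Bag(n) for n in sizes]
slow_sorted = sorted(slow, key=cmp_to_key(compare_by_size))

fast = [Bag(n) for n in sizes]
# key= evaluates get_memory_size once per bag and sorts on the cached values.
fast_sorted = sorted(fast, key=Bag.get_memory_size)

print("comparator calls:", total_calls(slow))
print("cached-key calls:", total_calls(fast))
```

The cached-key variant makes exactly one size query per bag, whereas the comparator variant makes two per comparison, so the gap widens as the list of bags grows.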
                

&lt;/pre&gt;</description>
    <dc:creator>Rohini Palaniswamy (JIRA)</dc:creator>
    <dc:date>2013-06-17T22:45:20</dc:date>
  </item>
  <item rdf:about="http://permalink.gmane.org/gmane.comp.java.hadoop.pig.devel/31528">
    <title>[Commented] (PIG-3325) Adding a tuple to a bag is slow</title>
    <link>http://permalink.gmane.org/gmane.comp.java.hadoop.pig.devel/31528</link>
    <description>&lt;pre&gt;
    [ https://issues.apache.org/jira/browse/PIG-3325?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&amp;amp;focusedCommentId=13686100#comment-13686100 ] 

Dmitriy V. Ryaboy commented on PIG-3325:
----------------------------------------

The previous behavior (having SMM check all bags) was pretty bad: it caused significant sudden delays if the data you were loading had bags in it. We observed pretty good speed gains for those use cases once we got rid of mandatory bag registration. We also got rid of a few memory leaks while we were in there, as well as the linked list maintenance overhead in SMM.
                

&lt;/pre&gt;</description>
    <dc:creator>Dmitriy V. Ryaboy (JIRA)</dc:creator>
    <dc:date>2013-06-17T22:32:20</dc:date>
  </item>
  <item rdf:about="http://permalink.gmane.org/gmane.comp.java.hadoop.pig.devel/31527">
    <title>[Commented] (PIG-3325) Adding a tuple to a bag is slow</title>
    <link>http://permalink.gmane.org/gmane.comp.java.hadoop.pig.devel/31527</link>
    <description>&lt;pre&gt;
    [ https://issues.apache.org/jira/browse/PIG-3325?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&amp;amp;focusedCommentId=13685881#comment-13685881 ] 

Mark Wagner commented on PIG-3325:
----------------------------------

Thanks for taking a look, Dmitriy. I agree that doing work in add() is the wrong way to go. I don't think there's a way to get the time back down to 400 ns while still having lazy registration, but that may be okay if it prevents bad behavior elsewhere.

I'll try out caching the memory sizes during sorting and see how things improve. That should improve performance no matter how 'spillables' gets populated.
                

&lt;/pre&gt;</description>
    <dc:creator>Mark Wagner (JIRA)</dc:creator>
    <dc:date>2013-06-17T19:27:20</dc:date>
  </item>
  <textinput rdf:about="http://search.gmane.org/?group=$group=gmane.comp.java.hadoop.pig.devel">
    <title>Search Engine</title>
    <description>Search the mailing list at Gmane</description>
    <name>query</name>
    <link>http://search.gmane.org/?group=$group=gmane.comp.java.hadoop.pig.devel</link>
  </textinput>
</rdf:RDF>
