<?xml version="1.0" encoding="UTF-8"?>
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns="http://purl.org/rss/1.0/" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:syn="http://purl.org/rss/1.0/modules/syndication/" xmlns:admin="http://webns.net/mvcb/">
  <channel rdf:about="http://permalink.gmane.org/gmane.comp.python.xml">
    <title>gmane.comp.python.xml</title>
    <link>http://permalink.gmane.org/gmane.comp.python.xml</link>
    <description/>
    <syn:updatePeriod>hourly</syn:updatePeriod>
    <syn:updateFrequency>1</syn:updateFrequency>
    <syn:updateBase>1901-01-01T00:00+00:00</syn:updateBase>
    <items>
      <rdf:Seq>
        <rdf:li rdf:resource="http://permalink.gmane.org/gmane.comp.python.xml/4573"/>
        <rdf:li rdf:resource="http://permalink.gmane.org/gmane.comp.python.xml/4572"/>
        <rdf:li rdf:resource="http://permalink.gmane.org/gmane.comp.python.xml/4571"/>
        <rdf:li rdf:resource="http://permalink.gmane.org/gmane.comp.python.xml/4570"/>
        <rdf:li rdf:resource="http://permalink.gmane.org/gmane.comp.python.xml/4569"/>
        <rdf:li rdf:resource="http://permalink.gmane.org/gmane.comp.python.xml/4568"/>
        <rdf:li rdf:resource="http://permalink.gmane.org/gmane.comp.python.xml/4567"/>
        <rdf:li rdf:resource="http://permalink.gmane.org/gmane.comp.python.xml/4566"/>
        <rdf:li rdf:resource="http://permalink.gmane.org/gmane.comp.python.xml/4565"/>
        <rdf:li rdf:resource="http://permalink.gmane.org/gmane.comp.python.xml/4564"/>
        <rdf:li rdf:resource="http://permalink.gmane.org/gmane.comp.python.xml/4563"/>
        <rdf:li rdf:resource="http://permalink.gmane.org/gmane.comp.python.xml/4562"/>
        <rdf:li rdf:resource="http://permalink.gmane.org/gmane.comp.python.xml/4561"/>
        <rdf:li rdf:resource="http://permalink.gmane.org/gmane.comp.python.xml/4560"/>
        <rdf:li rdf:resource="http://permalink.gmane.org/gmane.comp.python.xml/4559"/>
        <rdf:li rdf:resource="http://permalink.gmane.org/gmane.comp.python.xml/4558"/>
        <rdf:li rdf:resource="http://permalink.gmane.org/gmane.comp.python.xml/4557"/>
        <rdf:li rdf:resource="http://permalink.gmane.org/gmane.comp.python.xml/4556"/>
        <rdf:li rdf:resource="http://permalink.gmane.org/gmane.comp.python.xml/4555"/>
        <rdf:li rdf:resource="http://permalink.gmane.org/gmane.comp.python.xml/4554"/>
      </rdf:Seq>
    </items>
    <image rdf:resource="http://gmane.org/img/gmane-25t.png"/>
    <textinput rdf:resource=""/>
  </channel>
  <image rdf:about="http://gmane.org/img/gmane-25t.png">
    <title>Gmane</title>
    <url>http://gmane.org/img/gmane-25t.png</url>
    <link>http://gmane.org</link>
  </image>
  <item rdf:about="http://permalink.gmane.org/gmane.comp.python.xml/4573">
    <title>Re: Parsing XML file with Minidom has problem with cr/lf</title>
    <link>http://permalink.gmane.org/gmane.comp.python.xml/4573</link>
    <description>&lt;pre&gt;Thank you everyone for the excellent replies.

As someone noticed, my original complaint was that the parser was
returning linefeeds at all in the DOM tree. I thought that the Windows
cr/lf format was causing this but now understand that this is what it
is supposed to do.

I received conflicting advice on whether to process the XML files as
binary or text but that is a topic for a different thread.

Wayne 



&lt;/pre&gt;</description>
    <dc:creator>Peterson, Wayne</dc:creator>
    <dc:date>2010-05-12T13:46:39</dc:date>
  </item>
  <item rdf:about="http://permalink.gmane.org/gmane.comp.python.xml/4572">
    <title>Re: Parsing XML file with Minidom has problem with cr/lf</title>
    <link>http://permalink.gmane.org/gmane.comp.python.xml/4572</link>
    <description>&lt;pre&gt;"Martin v. Löwis" wrote at 2010-5-11 09:14 +0200:

I may have misunderstood the original problem report.
I have read it as: I see "\r\n" text nodes.



--
Dieter
_______________________________________________
XML-SIG maillist  -  XML-SIG&amp;lt; at &amp;gt;python.org
http://mail.python.org/mailman/listinfo/xml-sig

&lt;/pre&gt;</description>
    <dc:creator>Dieter Maurer</dc:creator>
    <dc:date>2010-05-11T07:42:25</dc:date>
  </item>
  <item rdf:about="http://permalink.gmane.org/gmane.comp.python.xml/4571">
    <title>Re: Parsing XML file with Minidom has problem with cr/lf</title>
    <link>http://permalink.gmane.org/gmane.comp.python.xml/4571</link>
    <description>&lt;pre&gt;
Why do you say that? It expects them just fine, replacing them with \n
line endings, then inserting those into the DOM tree. Just as it should.
I believe the OP was complaining that it creates those text nodes in
the first place, not that it does or does not specifically do that for
\r\n line endings.
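
A minimal sketch of that normalization, assuming Python 3 and the
default expat-based minidom parser. The "form" and "survey" element
names are placeholders, and the document is built in memory rather
than read from the OP's file:

```python
from xml.dom import minidom

# Build a tiny document in memory; the element names are made up.
src = minidom.Document()
form = src.createElement("form")
src.appendChild(form)
form.appendChild(src.createTextNode("\n"))
form.appendChild(src.createElement("survey"))
form.appendChild(src.createTextNode("\n"))

# Serialize, then switch to Windows line endings to mimic a cr/lf file.
crlf_bytes = src.toxml().replace("\n", "\r\n").encode("utf-8")

parsed = minidom.parseString(crlf_bytes)
first = parsed.documentElement.firstChild

print(b"\r\n" in crlf_bytes)   # does the input really contain cr/lf?
print(repr(first.data))        # what the parser stored in the text node
```

The text node is still created, which is the behaviour the OP ran
into; only its content is reduced to a single line feed.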

Regards,
Martin

&lt;/pre&gt;</description>
    <dc:creator>Martin v. Löwis</dc:creator>
    <dc:date>2010-05-11T07:14:59</dc:date>
  </item>
  <item rdf:about="http://permalink.gmane.org/gmane.comp.python.xml/4570">
    <title>Re: Parsing XML file with Minidom has problem with cr/lf</title>
    <link>http://permalink.gmane.org/gmane.comp.python.xml/4570</link>
    <description>&lt;pre&gt;Bill Kinnersley, 10.05.2010 19:59:

Sorry, but the only sane way to read them is as binary data. Passing 
unicode text to the parser will interfere with the encoding declaration at 
the beginning.
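
A minimal sketch of the binary approach, assuming Python 3. The file
name, the "author" element and the Latin-1 encoding are picked only
for illustration:

```python
import os
import tempfile
import xml.etree.ElementTree as ET
from xml.dom import minidom

# Write a small Latin-1 encoded file; the element name is made up.
root = ET.Element("author")
root.text = "Martin v. L\u00f6wis"
path = os.path.join(tempfile.mkdtemp(), "sample.xml")
ET.ElementTree(root).write(path, encoding="iso-8859-1",
                           xml_declaration=True)

# Open in binary mode so the parser sees the raw bytes and can apply
# the encoding named in the XML declaration itself.
with open(path, "rb") as infile:
    doc = minidom.parse(infile)

name = doc.documentElement.firstChild.data
print(name == root.text)
```

Because the file is opened with "rb", no text-mode decoding happens
before the parser reads the declaration.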



Interesting. I wasn't aware of that, but it's true.

http://www.w3.org/TR/REC-xml/#sec-line-ends

Stefan

&lt;/pre&gt;</description>
    <dc:creator>Stefan Behnel</dc:creator>
    <dc:date>2010-05-11T06:16:13</dc:date>
  </item>
  <item rdf:about="http://permalink.gmane.org/gmane.comp.python.xml/4569">
    <title>Re: Parsing XML file with Minidom has problem with cr/lf</title>
    <link>http://permalink.gmane.org/gmane.comp.python.xml/4569</link>
    <description>&lt;pre&gt;
XML files contain encoded text, and must be handled as binary files.


  -Fred

&lt;/pre&gt;</description>
    <dc:creator>Fred Drake</dc:creator>
    <dc:date>2010-05-10T18:58:55</dc:date>
  </item>
  <item rdf:about="http://permalink.gmane.org/gmane.comp.python.xml/4568">
    <title>Re: Parsing XML file with Minidom has problem with cr/lf</title>
    <link>http://permalink.gmane.org/gmane.comp.python.xml/4568</link>
    <description>&lt;pre&gt;
Wayne,

It sounds to me like you're doing everything correctly.

- XML files are text files, and should be read as text.

- In the absence of a DTD, all whitespace is regarded as significant. 
Typically this means yes, there will be a text node between consecutive 
element nodes.

- The XML processor is required to return end-of-line as a single '\n', 
regardless of which OS or programming language.

If you are traversing every node, you'll need to explicitly ignore the 
text nodes. More usually you don't have to deal with them, because you 
know what nodes you're looking for and pick them out with 
getElementsByTagName.
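
Both approaches can be sketched as follows, assuming Python 3. The
"form" and "survey" names are placeholders, not the OP's real schema:

```python
from xml.dom import minidom

# Build and serialize a small document with whitespace between tags.
src = minidom.Document()
form = src.createElement("form")
src.appendChild(form)
for _ in range(2):
    form.appendChild(src.createTextNode("\n  "))
    form.appendChild(src.createElement("survey"))
form.appendChild(src.createTextNode("\n"))

doc = minidom.parseString(src.toxml())
root = doc.documentElement

# Option 1: walk every child and skip anything that is not an element.
elements = [n for n in root.childNodes if n.nodeType == n.ELEMENT_NODE]

# Option 2: ask for the elements by name and never see the text nodes.
surveys = root.getElementsByTagName("survey")

print(len(root.childNodes), len(elements), len(surveys))
```

Note that getElementsByTagName searches all descendants, not only
direct children, so option 1 is the safer choice when the same tag
name can appear at several levels.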



&lt;/pre&gt;</description>
    <dc:creator>Bill Kinnersley</dc:creator>
    <dc:date>2010-05-10T17:59:17</dc:date>
  </item>
  <item rdf:about="http://permalink.gmane.org/gmane.comp.python.xml/4567">
    <title>Re: Parsing XML file with Minidom has problem with cr/lf</title>
    <link>http://permalink.gmane.org/gmane.comp.python.xml/4567</link>
    <description>&lt;pre&gt;That's what I thought as well. I was expecting the parser to ignore all
forms of linefeed.

I believe I am accessing my files as text files. The documentation for
minidom.parse says you can pass it a file name or a file object and I
have tried it both ways with the same result. Here is the open statement
I am using.

infile = open(in_path_file, 'r')
in_xmldoc = minidom.parse(infile)

The input file contains cr/lf line endings x'0d0a'.

When I do something like,

surveys = form.childNodes

the surveys.firstChild node will contain x'0a' which I have to ignore.

Wayne  

-----Original Message-----
From: Dieter Maurer [mailto:dieter&amp;lt; at &amp;gt;handshake.de] 
Sent: Sunday, May 09, 2010 11:50 PM
To: Peterson, Wayne
Cc: xml-sig&amp;lt; at &amp;gt;python.org
Subject: Re: [XML-SIG] Parsing XML file with Minidom has problem with
cr/lf

Peterson, Wayne wrote at 2010-5-8 23:43 -0700:

The parser should not see these "cr/lf" characters at all.

Python strings themselves use only "\n" (aka "lf") to delimit lines.
The "\r" (aka "cr") should only be introd&lt;/pre&gt;</description>
    <dc:creator>Peterson, Wayne</dc:creator>
    <dc:date>2010-05-10T14:04:05</dc:date>
  </item>
  <item rdf:about="http://permalink.gmane.org/gmane.comp.python.xml/4566">
    <title>Re: Parsing XML file with Minidom has problem with cr/lf</title>
    <link>http://permalink.gmane.org/gmane.comp.python.xml/4566</link>
    <description>&lt;pre&gt;Dieter Maurer, 10.05.2010 09:07:

Interesting. Then this might really be a bug. There was a change in Python 
2.6.5 that broke universal newline handling for the codecs module, this 
might hit here.

However, according to what the OP described, the cr/lf characters turn up 
correctly now, so ISTM that it's the plain '\n' line ending that needs fixing.

Stefan

&lt;/pre&gt;</description>
    <dc:creator>Stefan Behnel</dc:creator>
    <dc:date>2010-05-10T07:43:25</dc:date>
  </item>
  <item rdf:about="http://permalink.gmane.org/gmane.comp.python.xml/4565">
    <title>Re: Parsing XML file with Minidom has problem with cr/lf</title>
    <link>http://permalink.gmane.org/gmane.comp.python.xml/4565</link>
    <description>&lt;pre&gt;Stefan Behnel wrote at 2010-5-10 08:57 +0200:

Why do you think so?

The default "minidom" parser seems not to expect "\r\n" line endings....



--
Dieter

&lt;/pre&gt;</description>
    <dc:creator>Dieter Maurer</dc:creator>
    <dc:date>2010-05-10T07:07:55</dc:date>
  </item>
  <item rdf:about="http://permalink.gmane.org/gmane.comp.python.xml/4564">
    <title>Re: Parsing XML file with Minidom has problem with cr/lf</title>
    <link>http://permalink.gmane.org/gmane.comp.python.xml/4564</link>
    <description>&lt;pre&gt;Dieter Maurer, 10.05.2010 07:50:

The correct way to parse XML files is as binary data.

Stefan

&lt;/pre&gt;</description>
    <dc:creator>Stefan Behnel</dc:creator>
    <dc:date>2010-05-10T06:57:43</dc:date>
  </item>
  <item rdf:about="http://permalink.gmane.org/gmane.comp.python.xml/4563">
    <title>Re: Parsing XML file with Minidom has problem with cr/lf</title>
    <link>http://permalink.gmane.org/gmane.comp.python.xml/4563</link>
    <description>&lt;pre&gt;Peterson, Wayne wrote at 2010-5-8 23:43 -0700:

The parser should not see these "cr/lf" characters at all.

Python strings themselves use only "\n" (aka "lf") to delimit lines.
The "\r" (aka "cr") should only be introduced when those lines
are written to text files. And they should be removed when
those lines are read in again.

Are you sure that you access your files as "text" files?



--
Dieter

&lt;/pre&gt;</description>
    <dc:creator>Dieter Maurer</dc:creator>
    <dc:date>2010-05-10T05:50:03</dc:date>
  </item>
  <item rdf:about="http://permalink.gmane.org/gmane.comp.python.xml/4562">
    <title>Re: Parsing XML file with Minidom has problem with cr/lf</title>
    <link>http://permalink.gmane.org/gmane.comp.python.xml/4562</link>
    <description>&lt;pre&gt;Peterson, Wayne, 09.05.2010 08:43:

Whitespace is significant in the W3C DOM, so minidom must provide it in the 
DOM tree. It doesn't "have problems" because it creates text nodes for 
them, that's just the way things work.

Note that the xml.etree.ElementTree package tends to be a lot more user 
friendly for XML handling than the minidom package, simply because it 
focuses on the XML Infoset and moves text out of the way when dealing with 
elements.
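
A small side-by-side sketch of that difference, assuming Python 3.
The element names are made up for illustration:

```python
import xml.etree.ElementTree as ET
from xml.dom import minidom

# Build a small document with line breaks between the tags.
root = ET.Element("form")
root.text = "\n"
survey = ET.SubElement(root, "survey")
survey.tail = "\n"
data = ET.tostring(root)

# minidom exposes the whitespace as sibling text nodes.
dom_children = minidom.parseString(data).documentElement.childNodes

# ElementTree keeps it in .text / .tail, so children are elements only.
et_root = ET.fromstring(data)
et_children = list(et_root)

print(len(dom_children), len(et_children))
print(repr(et_root.text), repr(et_children[0].tail))
```

The whitespace is not lost in ElementTree; it is attached to the
neighbouring elements instead of showing up as extra children.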

Stefan

&lt;/pre&gt;</description>
    <dc:creator>Stefan Behnel</dc:creator>
    <dc:date>2010-05-09T17:27:05</dc:date>
  </item>
  <item rdf:about="http://permalink.gmane.org/gmane.comp.python.xml/4561">
    <title>Parsing XML file with Minidom has problem with cr/lf</title>
    <link>http://permalink.gmane.org/gmane.comp.python.xml/4561</link>
    <description>&lt;pre&gt;
&lt;/pre&gt;</description>
    <dc:creator>Peterson, Wayne</dc:creator>
    <dc:date>2010-05-09T06:43:09</dc:date>
  </item>
  <item rdf:about="http://permalink.gmane.org/gmane.comp.python.xml/4560">
    <title>Re: parsing XML with minidom</title>
    <link>http://permalink.gmane.org/gmane.comp.python.xml/4560</link>
    <description>&lt;pre&gt;
Thanks all for the help. This gives me a lot of good options and I have a few
working.... I learned a lot!





kimmyaf wrote:

&lt;/pre&gt;</description>
    <dc:creator>kimmyaf</dc:creator>
    <dc:date>2010-04-28T21:37:13</dc:date>
  </item>
  <item rdf:about="http://permalink.gmane.org/gmane.comp.python.xml/4559">
    <title>Re: parsing XML with minidom</title>
    <link>http://permalink.gmane.org/gmane.comp.python.xml/4559</link>
    <description>&lt;pre&gt;kimmyaf, 27.04.2010 23:32:

 &amp;gt;&amp;gt; Do I have to use a file? I tried to do
 &amp;gt;&amp;gt;
 &amp;gt;&amp;gt; tree = ET.parse(xml_response)

parse() is meant for parsing files. Use fromstring() to parse from a string.

This works for me:

   &amp;gt;&amp;gt;&amp;gt; import xml.etree.cElementTree as ET
   &amp;gt;&amp;gt;&amp;gt; tree = ET.parse('gmap.xml')
   &amp;gt;&amp;gt;&amp;gt; print [ (el.findtext('lat'), el.findtext('lng'))
   ...         for el in tree.getiterator('location') ]
   [('42.3118520', '-71.2632680')]
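
For the fromstring() route, here is a minimal sketch assuming
Python 3, where xml.etree.ElementTree and iter() take the place of
cElementTree and getiterator(), which newer Python 3 releases no
longer provide. Only the location, lat and lng names come from this
thread; the rest of the sample structure is a guess:

```python
import xml.etree.ElementTree as ET

# Stand-in for the downloaded response (hypothetical structure).
resp = ET.Element("GeocodeResponse")
loc = ET.SubElement(ET.SubElement(resp, "result"), "location")
ET.SubElement(loc, "lat").text = "42.3118520"
ET.SubElement(loc, "lng").text = "-71.2632680"
xml_response = ET.tostring(resp)

# fromstring() parses data already in memory and returns the root
# element, so no file on disk is needed.
root = ET.fromstring(xml_response)
coords = [(el.findtext("lat"), el.findtext("lng"))
          for el in root.iter("location")]
print(coords)
```

Unlike parse(), fromstring() returns the root element directly
rather than an ElementTree object.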

Stefan

&lt;/pre&gt;</description>
    <dc:creator>Stefan Behnel</dc:creator>
    <dc:date>2010-04-28T05:43:32</dc:date>
  </item>
  <item rdf:about="http://permalink.gmane.org/gmane.comp.python.xml/4558">
    <title>Re: parsing XML with minidom</title>
    <link>http://permalink.gmane.org/gmane.comp.python.xml/4558</link>
    <description>&lt;pre&gt;
And indeed, they do change their schemas without real concern for backward
compatibility.  The sitemaps are in the middle of changing even now.


  -Fred

&lt;/pre&gt;</description>
    <dc:creator>Fred Drake</dc:creator>
    <dc:date>2010-04-28T05:09:59</dc:date>
  </item>
  <item rdf:about="http://permalink.gmane.org/gmane.comp.python.xml/4557">
    <title>Re: parsing XML with minidom</title>
    <link>http://permalink.gmane.org/gmane.comp.python.xml/4557</link>
    <description>&lt;pre&gt;
&lt;/pre&gt;</description>
    <dc:creator>Peter Bigot</dc:creator>
    <dc:date>2010-04-27T23:37:33</dc:date>
  </item>
  <item rdf:about="http://permalink.gmane.org/gmane.comp.python.xml/4556">
    <title>Re: parsing XML with minidom</title>
    <link>http://permalink.gmane.org/gmane.comp.python.xml/4556</link>
    <description>&lt;pre&gt;2010/4/27 kimmyaf &amp;lt;flahertyk1&amp;lt; at &amp;gt;hotmail.com&amp;gt;:

I prefer amara:

...     print loc.lat, loc.lng
...
42.3118520 -71.2632680

;)

--lm


&lt;/pre&gt;</description>
    <dc:creator>Luis Miguel Morillas</dc:creator>
    <dc:date>2010-04-27T22:34:54</dc:date>
  </item>
  <item rdf:about="http://permalink.gmane.org/gmane.comp.python.xml/4555">
    <title>Re: parsing XML with minidom</title>
    <link>http://permalink.gmane.org/gmane.comp.python.xml/4555</link>
    <description>&lt;pre&gt;
Now that I look at my file it does not look well formed. Do I have to use a
file? I tried to do

tree = ET.parse(xml_response)

but I got a file IO error...




kimmyaf wrote:

&lt;/pre&gt;</description>
    <dc:creator>kimmyaf</dc:creator>
    <dc:date>2010-04-27T21:32:39</dc:date>
  </item>
  <item rdf:about="http://permalink.gmane.org/gmane.comp.python.xml/4554">
    <title>Re: parsing XML with minidom</title>
    <link>http://permalink.gmane.org/gmane.comp.python.xml/4554</link>
    <description>&lt;pre&gt;
I don't really know... Here's the whole story.

I am retrieving the xml by calling this link.

http://maps.google.com/maps/api/geocode/xml?address=50+Oakland+St,Wellesley,MA,02481&amp;amp;sensor=true



Here's the entire function:

addr = '50+Oakland+St,Wellesley,MA,02481'

def geocode_addr(addr):
    hostname =  'http://maps.google.com/maps/api/geocode/xml?'
    prefix = 'address='
    sensor = '&amp;amp;sensor=true'
    url = hostname + prefix + addr + sensor
    
    print url
    
    handler = urllib2.urlopen(url)
            
    xml_response = handler.read()    
    print xml_response
    #dom = minidom.parseString(xml_response)    
    handler.close()
    
    tree = ET.parse("GeocodeResponse.xml")
    print 'here'
    for tag in tree.getiterator("location"):
        print 'here1'
        print tag.findtext("lat")
        tag.findtext("lng")


*** I actually just pasted the xml from the shell where I printed
xml_response and saved it into an xml file in my folder called
GeocodeResponse.xml to test this... before go&lt;/pre&gt;</description>
    <dc:creator>kimmyaf</dc:creator>
    <dc:date>2010-04-27T21:30:39</dc:date>
  </item>
  <item rdf:about="http://permalink.gmane.org/gmane.comp.python.xml/4553">
    <title>Re: parsing XML with minidom</title>
    <link>http://permalink.gmane.org/gmane.comp.python.xml/4553</link>
    <description>&lt;pre&gt;kimmyaf, 26.04.2010 23:14:

Maybe the document uses namespace declarations that you forgot to show us?

Stefan

&lt;/pre&gt;</description>
    <dc:creator>Stefan Behnel</dc:creator>
    <dc:date>2010-04-27T11:35:32</dc:date>
  </item>
  <textinput rdf:about="http://search.gmane.org/?group=$group=gmane.comp.python.xml">
    <title>Search Engine</title>
    <description>Search the mailing list at Gmane</description>
    <name>query</name>
    <link>http://search.gmane.org/?group=$group=gmane.comp.python.xml</link>
  </textinput>
</rdf:RDF>
