<?xml version="1.0" encoding="UTF-8"?>
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns="http://purl.org/rss/1.0/" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:syn="http://purl.org/rss/1.0/modules/syndication/" xmlns:admin="http://webns.net/mvcb/">
  <channel about="http://blog.gmane.org/gmane.network.wwwoffle.user">
    <title>gmane.network.wwwoffle.user</title>
    <link>http://blog.gmane.org/gmane.network.wwwoffle.user</link>
    <description/>
    <syn:updatePeriod>hourly</syn:updatePeriod>
    <syn:updateFrequency>1</syn:updateFrequency>
    <syn:updateBase>1901-01-01T00:00+00:00</syn:updateBase>
    <items>
      <rdf:Seq>
        <rdf:li rdf:resource="http://comments.gmane.org/gmane.network.wwwoffle.user/1404"/>
        <rdf:li rdf:resource="http://comments.gmane.org/gmane.network.wwwoffle.user/1397"/>
        <rdf:li rdf:resource="http://comments.gmane.org/gmane.network.wwwoffle.user/1395"/>
        <rdf:li rdf:resource="http://comments.gmane.org/gmane.network.wwwoffle.user/1391"/>
        <rdf:li rdf:resource="http://comments.gmane.org/gmane.network.wwwoffle.user/1384"/>
        <rdf:li rdf:resource="http://comments.gmane.org/gmane.network.wwwoffle.user/1383"/>
        <rdf:li rdf:resource="http://comments.gmane.org/gmane.network.wwwoffle.user/1381"/>
        <rdf:li rdf:resource="http://comments.gmane.org/gmane.network.wwwoffle.user/1379"/>
        <rdf:li rdf:resource="http://comments.gmane.org/gmane.network.wwwoffle.user/1376"/>
        <rdf:li rdf:resource="http://comments.gmane.org/gmane.network.wwwoffle.user/1373"/>
        <rdf:li rdf:resource="http://comments.gmane.org/gmane.network.wwwoffle.user/1371"/>
        <rdf:li rdf:resource="http://comments.gmane.org/gmane.network.wwwoffle.user/1367"/>
        <rdf:li rdf:resource="http://comments.gmane.org/gmane.network.wwwoffle.user/1358"/>
        <rdf:li rdf:resource="http://comments.gmane.org/gmane.network.wwwoffle.user/1353"/>
        <rdf:li rdf:resource="http://comments.gmane.org/gmane.network.wwwoffle.user/1349"/>
        <rdf:li rdf:resource="http://comments.gmane.org/gmane.network.wwwoffle.user/1347"/>
        <rdf:li rdf:resource="http://comments.gmane.org/gmane.network.wwwoffle.user/1346"/>
        <rdf:li rdf:resource="http://comments.gmane.org/gmane.network.wwwoffle.user/1336"/>
        <rdf:li rdf:resource="http://comments.gmane.org/gmane.network.wwwoffle.user/1332"/>
        <rdf:li rdf:resource="http://comments.gmane.org/gmane.network.wwwoffle.user/1331"/>
      </rdf:Seq>
    </items>
    <image rdf:resource="http://gmane.org/img/gmane-25t.png"/>
    <textinput rdf:resource=""/>
  </channel>
  <image rdf:about="http://gmane.org/img/gmane-25t.png">
    <title>Gmane</title>
    <url>http://gmane.org/img/gmane-25t.png</url>
    <link>http://gmane.org</link>
  </image>
  <item rdf:about="http://comments.gmane.org/gmane.network.wwwoffle.user/1404">
    <title>www.consortiuminfo.org/standardsblog/ blocks wwwoffle?</title>
    <link>http://comments.gmane.org/gmane.network.wwwoffle.user/1404</link>
    <description>hi,

is anyone here able to access www.consortiuminfo.org/standardsblog/ through
wwwoffle?  i keep getting this message below, and i just don't get why they
would block a proxy, and also how they detect something anyways. 

what does wwwoffle do to the request? is it possible to make wwwoffle send out
the request exactly as it received it from the browser?

----------------- error message: ---------------------
Precondition Failed

We're sorry, but we could not fulfill your request for /standardsblog/article.php?story=20080708052706429 on this server.

We have established rules for access to this server, and any person or robot that violates these rules will be unable to access this site.

To resolve this problem, please try the following steps:

    * Ensure that your computer is free of viruses, Trojan horses, spyware or any other sort of malicious software.
    * If you are using any sort of personal firewall or browser privacy software, check to ensure that its settings do not cause your web browser to inadvertently violate any of the rules listed below.
    * If you are behind a Web proxy or corporate firewall, the proxy must conform to the HTTP specification with respect to proxy servers. Contact your network administrator if the trouble persists, or bypass the proxy and connect directly if possible.
    * Disable any download accelerators you may be using. They don't speed up your downloads anyway; in most cases, they actually run slower!
    * If all else fails, try using a different Web browser, such as Firefox.

If you still need assistance, please contact updegrove at consortiuminfo.org.
More Information

For your reference, the conditions for access to this server are:
Robots:

    * MUST read and obey robots.txt.
    * MUST identify themselves properly; for example MUST NOT identify as Mozilla.
    * MUST NOT pretend to be a human.

Humans:

    * MUST NOT pretend to be a robot.
    * MUST NOT use a computer infected with viruses, Trojan horses or other malicious software.

Both:

    * MUST NOT harvest email addresses.
    * MUST NOT attempt to send spam.
    * MUST NOT attempt to compromise server security.
    * MUST NOT use excessive amounts of bandwidth or other server resources.

The precondition on the request for the URL /standardsblog/article.php?story=20080708052706429 evaluated to false.
&lt;ADDRESS&gt;Apache/1.3.36 Server at www.consortiuminfo.org Port 80&lt;/ADDRESS&gt;
------------------------end --------------------------

greetings, martin.
</description>
    <dc:creator>Martin Bähr</dc:creator>
    <dc:date>2008-07-09T11:36:13</dc:date>
  </item>
  <item rdf:about="http://comments.gmane.org/gmane.network.wwwoffle.user/1397">
    <title>DontCache still ends up in outgoing</title>
    <link>http://comments.gmane.org/gmane.network.wwwoffle.user/1397</link>
    <description>Complaint: offline both
$ wwwoffle SomeDontGetURL
$ wwwoffle SomeDontCacheURL
both say
Requesting ThatURL
and return $?=0 to the shell,
even though WWWOFFLE intends to do no such fetching.

At least one can do
  # grep 'not to get' /var/log/syslog
  wwwoffles[5218]: The URL 'http://example.net/f.jpg' matches one in the
  list not to get.
to know about the former, but what about the latter?
The latter still ends up in http://localhost:8080/index/outgoing/
Clicking on it there, still here offline, says
  Your request for URL
  http://en.wikipedia.org/w/index.php?title=List_of_thinking_errors&amp;action=edit
  failed because it is on the list of hosts and/or paths that are not to
  be cached and cannot be requested when offline.
Well, OK, then it should be barred from ending up in outgoing too.

OK, to check for the latter one would do, after fetching,
# less +/not\ possible /var/log/syslog
(note I use maximum debug level for my messages)

Anyway, if the shell returned 1 and a message for both, one could much
more easily tell which of one's command line requests one had better make
other plans for (fetching by hand, as they are on our DontGet and
DontCache lists), rather than thinking WWWOFFLE will remember to fetch
them for us when indeed it has no such plans, and that will be the last we
will see of them until six months later when we realize that somehow
we never read the Plurbitsky article or whatever that we had on our
reading list.

Anyway, it sure is tough to check in a batch job way what will be
forgotten. And there's no http://localhost:8080/index/lastfailures --
but one would rather know if something bad will happen right at
$ wwwoffle $URL||echo Holmes, make other plans for that $URL. It is on \
one of your Dont lists.

Just keeping the mailing list warm, here with latest Debian sid wwwoffle 2.9a-2.


</description>
    <dc:creator>jidanni-8D0D3YcSAvhAfugRpC6u6w&lt; at &gt;public.gmane.org</dc:creator>
    <dc:date>2008-03-01T18:11:30</dc:date>
  </item>
  <item rdf:about="http://comments.gmane.org/gmane.network.wwwoffle.user/1395">
    <title>executives still use WWWOFFLE to keep track of what's already read</title>
    <link>http://comments.gmane.org/gmane.network.wwwoffle.user/1395</link>
    <description>Gentlemen, assuming that one has "the works" in the latest computing
equipment and network connections, why would one still use WWWOFFLE?

Well, certainly one cannot keep track of what articles one has already
read on a site with many articles. So with
Purge
{
 age=-1
}
one will always know which articles one has already read by looking
at their link colors (etc., as I mentioned in one of my previous postings).


</description>
    <dc:creator>jidanni-8D0D3YcSAvhAfugRpC6u6w&lt; at &gt;public.gmane.org</dc:creator>
    <dc:date>2007-11-27T18:46:50</dc:date>
  </item>
  <item rdf:about="http://comments.gmane.org/gmane.network.wwwoffle.user/1391">
    <title>404 with wwwoffle, page delivered without</title>
    <link>http://comments.gmane.org/gmane.network.wwwoffle.user/1391</link>
    <description>Hi,

when i visit a certain URL i get error 404 with wwwoffle and the page 
without a proxy.

Here is the URL:

http://www.instructables.com/id
/Bicyle-Power-for-Your-Television,-Laptop,-or-Cell-/?relatedLink

(note that i broke the URL to fit it in here, it should read
...id/Bi...)

Can anyone confirm and/or explain that? Here is my environment:

Server
- NetBSD 4.0_BETA2 i386
- wwwoffle version 2.9a

Client
- Debian 4.0r1 i386
- Mozilla Firefox 2.0.0.5

Thanks for your time.

MFG,

Karsten Kruse

</description>
    <dc:creator>Karsten Kruse</dc:creator>
    <dc:date>2007-10-06T18:28:43</dc:date>
  </item>
  <item rdf:about="http://comments.gmane.org/gmane.network.wwwoffle.user/1384">
    <title>Not quite wwwoffle - for mobiles and low bandwidth use.</title>
    <link>http://comments.gmane.org/gmane.network.wwwoffle.user/1384</link>
    <description>I've used wwwoffle for some years, and it 'just works' for me.

So, naturally, when I realised a need for a possibly related bit of 
software, and lacking results from google, I started wondering if anyone 
had thought of, or knows of, an implementation or proof of concept with 
wwwoffled.

Basically, it's a two part web proxy to drastically reduce web usage 
bandwidth.
One part resides on a mobile device with a (usually) poor bandwidth 
link, but relatively large amount of storage, that may occasionally be 
plugged into a high speed network.

The other part is on server, connected via a fast connection to the 
internet.

To quote a page I wrote describing this.

-------------------------------------

"This is a brief page describing a web proxy optimised for use on 
devices with a reasonable amount of persistent storage, and very limited 
bandwidth.

Once, each page linked to a subpage of contents, which remained static, 
and could be easily refreshed if it changed based on dates in the HTTP 
headers.

Now, this is the case in the minority of popular sites. Most sites now 
have a substantial fraction of pages with some non-static content.

As an example of this, consider http://www.ebay.com/index.html.

Over a 15 minute period, the size was constant at around 66K, and it was 
different most times it was loaded.

Simply compressing this page using advanced compression techniques 
provides a useful compression - taking the page to 15K.

A very simple test, using diff and gzip however, revealed that the 
variation between pages is quite small.

This means that if the user clicks 'reload', if the proxy simply 
compresses the page, the user needs to download 15K.

If, however, the user-agent and the proxy act in concert, this can be 
reduced to under 0.5K. (split on "&lt;", count the compressed differences).

This is done by the user-agent caching the pages it downloads, then 
informing the proxy of which version of the page it has.

The proxy then simply sends the compressed differences between the 
previous and current version.

Other optimisations:

     * Comparing pages, and ensuring that any page has in fact changed 
before downloading, as many servers misreport pages changed when they 
have not.
     * Convert all jpegs to progressive, and initially only download the 
first 'scan' of the image, which is 1/8th the size or so. Allow the user 
to download the remainder of the file for full resolution by clicking on 
it. "

-------------------------

As I understand it, the easiest way to implement this (over wwwoffle) 
would be with a special protocol.

Basically, the user-side part sends:
* I want http://www.ebay.com/ and have the version timestamped 
1188733136. with the hash f879f6ff876f8f...

The server side sends:
* Here is the page content &lt;compressed page&gt;, you must be confused, I 
don't have that timestamp.

Or
* That page has not changed.

Or
* Here is the diff between the timestamped version and the page you 
requested &lt;compressed diff&gt; the hash of the whole page is &lt;sha-256&gt;.

User-side then checks the hash of the local copy after applying the 
diff, and if it matches, stores this and serves it to the local browser.

If it doesn't match, it requests the page without compression.
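
A minimal sketch of this exchange in Python, for illustration only. It
assumes line-based deltas, zlib compression, and a server-side store
keyed by the page hash alone (the timestamp is left out for brevity).
The names page_hash, make_delta, server_reply and client_apply are made
up for this sketch and are not part of wwwoffle.

```python
import hashlib
import json
import zlib
from difflib import SequenceMatcher


def page_hash(lines):
    # SHA-256 over the joined lines; used to verify a patched page.
    return hashlib.sha256("\n".join(lines).encode("utf-8")).hexdigest()


def make_delta(old, new):
    # Unchanged runs become "copy" ranges into the old page;
    # everything else is sent as literal lines.
    ops = []
    matcher = SequenceMatcher(None, old, new, autojunk=False)
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            ops.append(("copy", i1, i2))
        elif j1 != j2:  # "replace" or "insert"; "delete" adds nothing
            ops.append(("data", new[j1:j2]))
    return ops


def server_reply(stored, current, client_hash):
    # stored maps a page hash to the lines of a version the server kept
    # (a hypothetical in-memory store).
    if client_hash == page_hash(current):
        return {"kind": "unchanged"}
    old = stored.get(client_hash)
    if old is None:
        # Unknown version: fall back to the whole compressed page.
        body = zlib.compress("\n".join(current).encode("utf-8"))
        return {"kind": "full", "body": body}
    delta = json.dumps(make_delta(old, current)).encode("utf-8")
    return {"kind": "diff", "body": zlib.compress(delta),
            "hash": page_hash(current)}


def client_apply(old, reply):
    if reply["kind"] == "unchanged":
        return old
    if reply["kind"] == "full":
        return zlib.decompress(reply["body"]).decode("utf-8").split("\n")
    new = []
    for op in json.loads(zlib.decompress(reply["body"]).decode("utf-8")):
        if op[0] == "copy":
            new.extend(old[op[1]:op[2]])
        else:
            new.extend(op[1])
    if page_hash(new) != reply["hash"]:
        return None  # mismatch: caller re-requests the full page
    return new
```

A None result from client_apply corresponds to the fallback case of
requesting the page again without the diff step.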

Obviously this is single user only.

To make it multiuser would require also storing pages in the form 
site/D598387453.HASH, and some complex expiry protocol to make sure that 
the same pages get expired per-user on the remote and local side.

Any thoughts?


</description>
    <dc:creator>Ian Stirling</dc:creator>
    <dc:date>2007-09-02T11:49:47</dc:date>
  </item>
  <item rdf:about="http://comments.gmane.org/gmane.network.wwwoffle.user/1383">
    <title>Dan Jacobson does not fall for "SpamBLK"</title>
    <link>http://comments.gmane.org/gmane.network.wwwoffle.user/1383</link>
    <description>e&gt; To: exp-IjSeDpznLsezQB+pC5nmwQ&lt; at &gt;public.gmane.org
e&gt; Subject: [WWWOFFLE-Users] only-same-host-frames
e&gt; The person you tried to send this email to is using SpamBLK...
e&gt; If you don't click the link above your previous email will be deleted.

Wait, it must be a trap Dan! Don't click! It must be a spam-bot
subscribed to the list, hoping to harvest "confirmed live ones". Well,
I'll have him know that I just happen to be College Educated -- no
easy mark. Ha!


</description>
    <dc:creator>jidanni-8D0D3YcSAvhAfugRpC6u6w&lt; at &gt;public.gmane.org</dc:creator>
    <dc:date>2007-09-01T13:33:31</dc:date>
  </item>
  <item rdf:about="http://comments.gmane.org/gmane.network.wwwoffle.user/1381">
    <title>only-same-host-frames</title>
    <link>http://comments.gmane.org/gmane.network.wwwoffle.user/1381</link>
    <description>Gentlemen, it's me again with another brilliant idea.
You know when those sites pull in those ad frames,
URL='http://news.com.com/Developing+nations+losing+spam+battle,+
Default Recursive Fetch options: stylesheets=0 images=0 frames=2
Frame=http://view.atdmt.com/M0N/iview/cntcmssc0770000080m0n/dire
Frame=http://view.atdmt.com/MRT/iview/cntnkinf0250006355mrt/dire

Well, there needs to be an only-same-host-frames variable, on the
model of only-same-host-images, for those of us that use frames=yes.


</description>
    <dc:creator>jidanni-8D0D3YcSAvhAfugRpC6u6w&lt; at &gt;public.gmane.org</dc:creator>
    <dc:date>2007-09-01T01:08:43</dc:date>
  </item>
  <item rdf:about="http://comments.gmane.org/gmane.network.wwwoffle.user/1379">
    <title>SSL certificates</title>
    <link>http://comments.gmane.org/gmane.network.wwwoffle.user/1379</link>
    <description>Folks,

  I have a slightly different use-case for wwwoffle - I scoop websites
  on one machine with wwwoffle, tar up the files, and pass them to
  another machine via UUCP.

  It allows me to provide web services for disconnected networks.

  I am upgrading to the latest version of wwwoffle, and have bumped into
  a problem or two.

  First, using Ubuntu Feisty and wwwoffle 2.9a-2 the creation of root
  certificates is not reliable on startup - I often get an empty file.
  There seems to be some discussion of this on the list, and maybe I
  need a newer version.

  Second, the path to those certificates seems to be hardcoded into the
  binary, at /etc/wwwoffle/certificates.

  I don't really like this, as I create dynamic wwwoffle instances on
  the fly, in /var/tmp/wwwoffle2345/* as a different user (uucp) and now
  it tangles up with my 'upstream' wwwoffle instance on the same
  machine.

  First prize for me is a way to disable all the SSL stuff completely,
  as all this happens unattended so SSL is not that necessary.

  For the moment I now run the master wwwoffle as the uucp user (ugh)
  and all instances share the certificates.

  Next best would be to be able to configure the certificate path, so I
  can have it as a different owner. 'strings' on the binary (have not
  looked at the source yet) suggests ownership and permissions of the
  certificates are important.

  Ideas ?

Cheers,     Andy!


</description>
    <dc:creator>Andy Rabagliati</dc:creator>
    <dc:date>2007-08-19T14:43:55</dc:date>
  </item>
  <item rdf:about="http://comments.gmane.org/gmane.network.wwwoffle.user/1376">
    <title>extra spaces being added</title>
    <link>http://comments.gmane.org/gmane.network.wwwoffle.user/1376</link>
    <description>While investigating
http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=395009 (which is a
weird bug...) I noticed that pages received via wwwoffle were larger
than directly received pages. Wwwoffle is adding spaces before /&gt;
strings.

To demonstrate:

# echo "&lt;bla /&gt;"|wwwoffle-write --addheader http://foo.bar/bla

# wwwoffle-read http://foo.bar/bla | cat -vet
HTTP/1.0 200 OK^M$
Content-Type: text/html^M$
^M$
&lt;bla /&gt;$

# http_proxy=http://localhost:8080/ GET http://foo.bar/bla | cat -vet
&lt;bla  /&gt;$


Now this shouldn't be a problem for properly written browsers (I've
checked that it doesn't happen with e.g. text/plain content), but I
suspect an off-by-one error somewhere...


Paul Slootman


</description>
    <dc:creator>Paul Slootman</dc:creator>
    <dc:date>2007-07-19T10:26:40</dc:date>
  </item>
  <item rdf:about="http://comments.gmane.org/gmane.network.wwwoffle.user/1373">
    <title>is it possible to have freshness information in response headers?</title>
    <link>http://comments.gmane.org/gmane.network.wwwoffle.user/1373</link>
    <description>Here is what I want to do: I want to turn off add-cache-info which I
currently use to see if the page I am browsing is fresh (x minutes ago
added to the page HTML), and instead I want some Firefox extension to
present that information in Firefox statusbar.

That's because the AddCacheInfo thing often breaks the page layout.

For that I would need WWWOFFLE to have the "x minutes ago" information
in some HTTP header in the response it gives to the browser. Is that
possible, and if not now, could you consider implementing it?

</description>
    <dc:creator>Miernik</dc:creator>
    <dc:date>2007-07-03T17:35:20</dc:date>
  </item>
  <item rdf:about="http://comments.gmane.org/gmane.network.wwwoffle.user/1371">
    <title>URL-specification `default'</title>
    <link>http://comments.gmane.org/gmane.network.wwwoffle.user/1371</link>
    <description>I just wanted to change the setting for purging from -1 to an age of
2m.

From &lt;http://localhost:8080/configuration/Purge/compress-age&gt; I
followed the link to the description of the URL-Specification,
&lt;http://localhost:8080/configuration/#URL-SPECIFICATION&gt;, where I
found this example:

,----
|    *://*/*
|           Any protocol, Any host, Any port, Any path, Any args (This
|           is that same as saying 'default').
`----

Apart from the wording (should be "THE same", I suppose), when I
entered "default" for the URL and requested the change to be
performed, I got a page complaining that the change failed as a URL
specification had been expected but "default" received.  (It worked
with "*://*/*".)

User bug?  Documentation bug?  Or should wwwoffle be changed to allow
that syntax?  This is with version 2.9a.

Best regards,

Albert.



</description>
    <dc:creator>Albert Reiner</dc:creator>
    <dc:date>2007-07-01T09:22:57</dc:date>
  </item>
  <item rdf:about="http://comments.gmane.org/gmane.network.wwwoffle.user/1367">
    <title>Recursive fetch with links to identical items</title>
    <link>http://comments.gmane.org/gmane.network.wwwoffle.user/1367</link>
    <description>I just wanted to make sure my understanding is correct: 

Suppose I am doing a recursive fetch with a high depth, and many of
those pages are interlinked or contain, e.g., references to the same
graphics files (logos etc.).  Will wwwoffle be smart enough to skip the
pages it already downloaded in this batch, or will it simply continue
following links until the depth is exceeded?

My assumption is that wwwoffle is smart enough not to fetch the same
file more than once during a single run.

On a related note[*]: Is there any defined relation between the order
of the output of `wwwoffle-ls outgoing' and the order in which pages
are fetched?

Thanks in advance for any light you may shed,

Albert.


[*] The relatedness stemming from the fact that the same graphics file
    URL kept appearing and vanishing again from the top position of
    the list of outgoing URLs, i.e., `wwwoffle-ls outgoing | head -1'.



</description>
    <dc:creator>Albert Reiner</dc:creator>
    <dc:date>2007-06-23T15:04:18</dc:date>
  </item>
  <item rdf:about="http://comments.gmane.org/gmane.network.wwwoffle.user/1358">
    <title>bug: disable-script can miss "&lt;/script&gt;"</title>
    <link>http://comments.gmane.org/gmane.network.wwwoffle.user/1358</link>
    <description>for this document:
----------------
&lt;html&gt;
&lt;script type="text/javaScript" language="javascript"&gt;
    document.write('&lt;input type="hidden" name="oRef" value="' + document.referrer.replace(/"/gi,'') + '" /&gt;');
&lt;/script&gt;
aaa
&lt;/html&gt;
-----------------

setting disable-script = on 

produces the following:
----------------------
&lt;html&gt;
&lt;!-- WWWOFFLE (disable-script) - script type="text/javaScript" language="javascript" --&gt;
&lt;!-- WWWOFFLE (disable-script) - ... --&gt;
----------------------

this is wrong.

the real world example is here:
http://www.microsoft.com/downloads/details.aspx?FamilyID=aea55f2f-07b5-4a8c-8a44-b4e1b196d5c0&amp;displaylang=en

looks like parser cannot detect the script end

</description>
    <dc:creator>Maxim Kirillov</dc:creator>
    <dc:date>2007-06-19T04:13:10</dc:date>
  </item>
  <item rdf:about="http://comments.gmane.org/gmane.network.wwwoffle.user/1353">
    <title>is an URI with &amp; and no ? valid?</title>
    <link>http://comments.gmane.org/gmane.network.wwwoffle.user/1353</link>
    <description>I have a problem with a website:

Sniffed direct connection headers:

GET /register.html&amp;termsread=1&amp;agree_to_terms=1 HTTP/1.0
User-Agent: Wget/1.10.2
Accept: */*
Host: www.moneymakergroup.com
Connection: Keep-Alive

HTTP/1.1 200 OK
Date: Sun, 17 Jun 2007 04:45:20 GMT
Server: Apache/2.0.52 (Red Hat)
X-Powered-By: PHP/5.2.2
Connection: close
Content-Type: text/html; charset=UTF-8


Now the same through wwwoffle:


GET /register.html%26termsread=1%26agree_to_terms=1 HTTP/1.0
User-Agent: Wget/1.10.2
Accept: */*
Host: www.moneymakergroup.com
Connection: close

HTTP/1.1 301 Moved Permanently
Date: Sun, 17 Jun 2007 04:44:50 GMT
Server: Apache/2.0.52 (Red Hat)
X-Powered-By: PHP/5.2.2
Location: http://www.moneymakergroup.com/register.html&amp;termsread=1&amp;agree_to_terms=1
Content-Length: 0
Connection: close
Content-Type: text/html; charset=UTF-8

GET /register.html%26termsread=1%26agree_to_terms=1 HTTP/1.0
User-Agent: Wget/1.10.2
Accept: */*
Host: www.moneymakergroup.com
Connection: close

HTTP/1.1 301 Moved Permanently
Date: Sun, 17 Jun 2007 04:44:51 GMT
Server: Apache/2.0.52 (Red Hat)
X-Powered-By: PHP/5.2.2
Location: http://www.moneymakergroup.com/register.html&amp;termsread=1&amp;agree_to_terms=1
Content-Length: 0
Connection: close
Content-Type: text/html; charset=UTF-8

.... and so on redirected back and forth endlessly.

Is this WWWOFFLE's fault or the website's? Any workarounds (provided I want it
proxied and cached)?

Why does WWWOFFLE want to substitute &amp; with %26 so much?

</description>
    <dc:creator>Miernik</dc:creator>
    <dc:date>2007-06-17T05:03:46</dc:date>
  </item>
  <item rdf:about="http://comments.gmane.org/gmane.network.wwwoffle.user/1349">
    <title>Access to proxy&lt; at &gt;8080 or 8082</title>
    <link>http://comments.gmane.org/gmane.network.wwwoffle.user/1349</link>
    <description>The way I use wwwoffle is to have two images running at the same time: one configured to be permanently off-line (8082), the other (8080) permanently on-line, but taken off &amp; on periodically to keep the indexes from getting too large.

What I am wondering is if there is a way (say on the browser bar) to address one or the other to fetch a url explicitly;  i.e.
    http://localhost:8082/get?http://news.bbc.co.uk/
    to get the cached version


or 
    http://localhost:8080/get?http://news.bbc.co.uk/
    to get an on-line version.

Pardon if this is a naive question:
   How does htdig do it?
   Does wwwoffle have to be offline when htdig is running?


</description>
    <dc:creator>Joshua Fein</dc:creator>
    <dc:date>2007-06-16T06:24:31</dc:date>
  </item>
  <item rdf:about="http://comments.gmane.org/gmane.network.wwwoffle.user/1347">
    <title>multi-letter state symbols for info/content</title>
    <link>http://comments.gmane.org/gmane.network.wwwoffle.user/1347</link>
    <description>In http://localhost/info/content?&lt;URL&gt;

  'X' if the URL is in the DontGet section of the configuration file,
  '+' if the URL is in the cache, '~' if it has been requested and '-'
  if it is not cached.

Actually there should be several state indicators:
[DC] in DontGet, but already in the cache, [dC] not in DontGet, but
already in the cache. Wait, there are also images=no images, so a
third field, "not to be fetched", is called for, etc.

As many fields as necessary to nail down (fully describe) the status
of the URL. As this isn't a page we look at all day, it's OK to fill
it up with more fields.

Otherwise users will be baffled as to why images show up as "+", even
though one has images=no (today, but not last week!)

P.S., just like "ls -l" "drwxr-xr-x" the future fields should line up
nicely, no matter what their toggle/state.


</description>
    <dc:creator>jidanni-8D0D3YcSAvhAfugRpC6u6w&lt; at &gt;public.gmane.org</dc:creator>
    <dc:date>2007-06-15T03:07:57</dc:date>
  </item>
  <item rdf:about="http://comments.gmane.org/gmane.network.wwwoffle.user/1346">
    <title>lining up the columns in index/monitor</title>
    <link>http://comments.gmane.org/gmane.network.wwwoffle.user/1346</link>
    <description>Gentlemen, how should the WWWOFFLE author best line up the URLs etc.
in index/monitor?

#  [L=10:6;N=31:0] http://traffic.tccg.gov.tw/
#  [L=105:5;N=31:0] http://wiki.debian.org.tw/
#  [L=33:2;N=25:17] http://wiki.debian.org.tw/index.php/CurrentEvents
#  [L=105:2;N=17:17] http://wiki.debian.org.tw/index.php/Main_Page
#  [L=165:7;N=15:17] http://wiki.debian.org/OngoingTransitions
#  [L=13:2;N=16:17] http://wiki.boringhost.com/index.php/Talk:MediaWiki

Perhaps by using a printf(3) statement with constant widths for the
numbers?
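
A rough sketch of what fixed-width formatting would do, written in
Python since its % operator follows printf(3) conventions. The field
widths (3 and 2) and the function name monitor_line are guesses for
illustration, not taken from the WWWOFFLE source.

```python
def monitor_line(l_a, l_b, n_a, n_b, url):
    # Right-align the first number of each pair in 3 columns and
    # left-align the second in 2, so the URL always starts at the
    # same column.
    return "[L=%3d:%-2d;N=%3d:%-2d] %s" % (l_a, l_b, n_a, n_b, url)


for row in [
    (10, 6, 31, 0, "http://traffic.tccg.gov.tw/"),
    (105, 5, 31, 0, "http://wiki.debian.org.tw/"),
    (33, 2, 25, 17, "http://wiki.debian.org.tw/index.php/CurrentEvents"),
]:
    print(monitor_line(*row))
```

The padding only lines up where whitespace is preserved, so in HTML
output the spaces would still have to become nbsp's or sit inside a PRE
block.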

Hmmm, currently we see
&lt;li&gt;&amp;nbsp;[L=126:0;N=31:0]&amp;nbsp;&lt;a hre...
perhaps stick those nbsp's inside the braces, replacing the spaces you
just made with the future printf with them, or go the &lt;PRE&gt; route
instead of &lt;UL&gt;.

Surely in  http://www.useit.com/alertbox/ somewhere it must say
"Non-lined up columns are bad news. 66% of users over the age of 66
reported nausea and confusion."


</description>
    <dc:creator>jidanni-8D0D3YcSAvhAfugRpC6u6w&lt; at &gt;public.gmane.org</dc:creator>
    <dc:date>2007-06-15T00:12:26</dc:date>
  </item>
  <item rdf:about="http://comments.gmane.org/gmane.network.wwwoffle.user/1336">
    <title>how much space is needed for htdig index?</title>
    <link>http://comments.gmane.org/gmane.network.wwwoffle.user/1336</link>
    <description>Can someone who uses htdig with wwwoffle tell me, for example, how
much disk space the htdig index etc. files take for a corresponding
wwwoffle cache size?

My wwwoffle cache is 1.85 GB and I have only 0.35 MB of free disk space,
and I wonder if I need to get more free disk space before starting the
indexing, or will it be fine? I never ran htdig before, so I don't have
an estimate how much it might be.

</description>
    <dc:creator>Miernik</dc:creator>
    <dc:date>2007-06-06T08:37:11</dc:date>
  </item>
  <item rdf:about="http://comments.gmane.org/gmane.network.wwwoffle.user/1332">
    <title>could WWWOFFLE not make AAAA DNS requests if it doesn't bind to any IPv6 addresses?</title>
    <link>http://comments.gmane.org/gmane.network.wwwoffle.user/1332</link>
    <description>Currently I think WWWOFFLE makes AAAA DNS requests even if it doesn't
bind to any IPv6 addresses, and there are no IPv6 addresses on the system
at all. These AAAA queries are completely useless and just waste a bit of
time. At least could there be an option in the config file not to make
any AAAA DNS queries? On a system without IPv6 connectivity (as most
systems are), it couldn't use the result anyway even if it got a response
from the DNS server (in most cases it doesn't). On systems with long RTT
to the Internet (think satellite connection), superfluous DNS queries
generate significant delays.

</description>
    <dc:creator>Miernik</dc:creator>
    <dc:date>2007-06-02T10:24:54</dc:date>
  </item>
  <item rdf:about="http://comments.gmane.org/gmane.network.wwwoffle.user/1331">
    <title>POST value is lost if URL aliased?</title>
    <link>http://comments.gmane.org/gmane.network.wwwoffle.user/1331</link>
    <description>In my Alias section I have:

    http://www.linuxbios.org/* = http://linuxbios.org/*

When I go to http://linuxbios.org/mailman/options/linuxbios
and I fill in the Email address: and Password: the form is posted by the
browser to the URL with www:

&lt;FORM action="http://www.linuxbios.org/mailman/options/linuxbios" method="POST" &gt;

WWWOFFLE redirects that to
http://linuxbios.org/mailman/options/linuxbios which is what I wanted,
but without the POST value: it just makes a GET query to that address,
which is not what should be done in such a case. Why doesn't it POST my
data to http://linuxbios.org/mailman/options/linuxbios ?
Is that a bug, or is this intentional?

Jun  2 21:15:55 tarnica wwwoffled[11468]: Forked wwwoffles -real (pid=14820).
Jun  2 21:15:55 tarnica wwwoffles[14820]: URL='http://www.linuxbios.org/mailman/options/linuxbios?!POST:c41q0LTjwleIkc1AGsN9Qw.4661c1eb'.
Jun  2 21:15:55 tarnica wwwoffles[14820]: proto='http'; hostport='www.linuxbios.org'; path='/mailman/options/linuxbios'; args='!POST:c41q0LTjwleIkc1AGsN9Qw.4661c1eb'; user:pass='(null):(null)'.
Jun  2 21:15:55 tarnica wwwoffles[14820]: Aliased URL='http://linuxbios.org/mailman/options/linuxbios'.
Jun  2 21:15:55 tarnica wwwoffles[14820]: Aliased proto='http'; hostport='linuxbios.org'; path='/mailman/options/linuxbios'; args='(null)'; user:pass='(null):(null)'.
Jun  2 21:15:55 tarnica wwwoffles[14820]: Client bytes; 632 Read, 2169 Written.
Jun  2 21:15:55 tarnica wwwoffled[11468]: Child wwwoffles exited with status 0 (pid=14820).

Jun  2 21:15:55 tarnica wwwoffled[11468]: Forked wwwoffles -real (pid=14821).
Jun  2 21:15:55 tarnica wwwoffles[14821]: URL='http://linuxbios.org/mailman/options/linuxbios'.
Jun  2 21:15:55 tarnica wwwoffles[14821]: proto='http'; hostport='linuxbios.org'; path='/mailman/options/linuxbios'; args='(null)'; user:pass='(null):(null)'.
Jun  2 21:15:55 tarnica wwwoffles[14821]: Not requesting URL (Last changed 00:04:20 (260s) ago, config is 00:00:-1 (-1s)).
Jun  2 21:15:55 tarnica wwwoffles[14821]: Cache Access Status='Cached Page Used'.
Jun  2 21:15:55 tarnica wwwoffles[14821]: Modifying page content of type 'text/html; charset=us-ascii'
Jun  2 21:15:55 tarnica wwwoffles[14821]: Client bytes; 673 Read, 5199 Written.
Jun  2 21:15:55 tarnica wwwoffled[11468]: Child wwwoffles exited with status 0 (pid=14821).

</description>
    <dc:creator>Miernik</dc:creator>
    <dc:date>2007-06-02T19:26:08</dc:date>
  </item>
  <item rdf:about="http://comments.gmane.org/gmane.network.wwwoffle.user/1327">
    <title>Release of sho_title0.9</title>
    <link>http://comments.gmane.org/gmane.network.wwwoffle.user/1327</link>
    <description>The indispensable Post-Script:

P.S

The download URL is:

https://sourceforge.net/project/platformdownload.php?group_id=182008

Thanks again,
Joshua Fein


</description>
    <dc:creator>Joshua Fein</dc:creator>
    <dc:date>2007-03-18T12:41:24</dc:date>
  </item>
  <textinput about="http://search.gmane.org/?group=$group=gmane.network.wwwoffle.user">
    <title>Search Engine</title>
    <description>Search the mailing list at Gmane</description>
    <name>query</name>
    <link>http://search.gmane.org/?group=$group=gmane.network.wwwoffle.user</link>
  </textinput>
</rdf:RDF>
