In comp.os.linux.advocacy, Handover Phist
<jason@xxxxxxxxxxxxxxxxxxxxxx>
wrote
on Wed, 16 Aug 2006 14:30:01 GMT
<slrnee6amf.5j4.jason@xxxxxxxxxxxxxxxxxxxxxx>:
> Roy Schestowitz :
>> __/ [ BearItAll ] on Wednesday 16 August 2006 14:47 \__
>>
>>> I wouldn't normally use a browser to download large files, much easier with
>>> wget. So I haven't hit the 2G limit before.
>>>
>>> But I was on a site this morning and the only way they gave was via the
>>> browser, though they did tell you that most browsers would hit a size limit,
>>> including Firefox (of course I missed that bit of text until I discovered
>>> the file wasn't fully downloaded, as you do). I thought that Opera would
>>> have no trouble, but it turns out it is the same.
>>>
>>> Gzilla is the one to use for these large downloads, it's working fine so
>>> far.
>>>
>>> But I just wondered if anyone had seen a reason for the download size
>>> limit? I can understand the limit of number of concurrent downloads,
>>> browsers are too big-n-bulky for that sort of job, but the file size limit
>>> doesn't seem to have a reason other than a number the browser writers
>>> picked out of the air.
>>
>> I can only think of the file size limit (4GB for NTFS) as a factor. I can
>> recall being forced to slice zip files to make file transfers possible.
>> Also, as my external hard-drive comes with a braindead filesystem, tars need
>> to be sliced and reassembled. Could the limit, which needs to be imposed
>> somewhere (probably not if properly implemented), be driven by some obscure
>> convention? Vista (beta 2) ISO is 3.5 GB.
>
> I could see MS Explorer implementing a 2 gig limit in `94 since fat32
> couldn't handle a larger file than that anyhoo. Perhaps other browser
> makers saw that as a standard. I wouldn't mind perusing the source code
> to find out why this would be, but Firefox weighs in at over 30 meg
> bzipped, and of course Explorer isn't available in source form.
>
> Does wget have a limit on the file size it can pipe to a filesystem?
>
I could see several spots for an implicit limitation, but none
for an explicit one, since wget presumably just writes what it
gets, whether through the portable C library (stdio), C++
streams, or even raw file descriptors with a fixed-size buffer
(perhaps 16k or so at most). I'd frankly have to look at the
source. The obvious candidates:
[1] wget calls strtol() instead of strtoll() in those cases
where it bothers to process Content-Length:; on a 32-bit long
that silently caps the parsed size at LONG_MAX, i.e. 2 GiB - 1
(see the sketch after this list).
[2] wget is writing on a 32-bit box without large-file support
compiled in; a 32-bit off_t caps any single file at 2 GiB - 1,
regardless of whether the underlying ext2 or ext3 filesystem
could hold more.
[3] wget has an intentional test to protect itself from
signed/unsigned sloppiness. (Why it would need such a test is
far from clear.)
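To make [1] concrete, here is a minimal standalone illustration --
not wget's code; the header value and variable names are made up --
of what happens when a Content-Length beyond 2 GB is parsed with
strtol() versus strtoll() on a platform whose long is 32 bits:

  #include <errno.h>
  #include <limits.h>
  #include <stdio.h>
  #include <stdlib.h>

  int main(void)
  {
      /* A Content-Length header value for a ~3.5 GB download. */
      const char *content_length = "3758096384";

      /* On a 32-bit long, strtol() overflows: it returns LONG_MAX
         (2147483647) and sets errno to ERANGE, silently turning
         3.5 GB into 2 GiB - 1. */
      errno = 0;
      long narrow = strtol(content_length, NULL, 10);
      printf("strtol : %ld%s\n", narrow,
             errno == ERANGE ? "  (overflow, clamped to LONG_MAX)" : "");

      /* strtoll() uses long long (at least 64 bits), so the full
         value survives. */
      errno = 0;
      long long wide = strtoll(content_length, NULL, 10);
      printf("strtoll: %lld%s\n", wide,
             errno == ERANGE ? "  (overflow)" : "");

      return 0;
  }

On a 64-bit long both calls print the same number; the failure mode
only appears where long is 32 bits, which is exactly the 2 GB
symptom under discussion.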
An 'ebuild net-misc/wget/wget-1.10.2.ebuild unpack' is
relatively simple and gives me pre-patched Gentoo source.
(And people wonder why I love this distro. :-) There are
vaguely similar methods on Red Hat and Debian, though.)
A quick scan shows that wget makes heavy use of C's portable
library (stdio), but not for the actual socket read:
fd_read_body() does indeed use a 16K stack-allocated buffer,
ultimately handing off to a little routine, sock_read(), which
simply does a read() on the file descriptor -- the server
socket -- passed thereto.
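For reference, that pattern looks roughly like this -- a simplified
sketch under my own names, not wget's actual fd_read_body() or
sock_read():

  #include <unistd.h>

  /* Pull data from one file descriptor through a fixed 16K stack
     buffer and push it to another. Nothing here tracks a running
     total, so no 2 GB limit can creep in at this layer. */
  static int copy_body(int infd, int outfd)
  {
      char buf[16384];
      ssize_t n;

      while ((n = read(infd, buf, sizeof buf)) > 0) {
          ssize_t off = 0;
          while (off < n) {            /* handle short writes */
              ssize_t w = write(outfd, buf + off, (size_t)(n - off));
              if (w < 0)
                  return -1;           /* e.g. EFBIG or ENOSPC */
              off += w;
          }
      }
      return (n < 0) ? -1 : 0;         /* read error vs. clean EOF */
  }

  int main(void)
  {
      /* Copy stdin to stdout: ./a.out < big.iso > copy.iso */
      return copy_body(0, 1) == 0 ? 0 : 1;
  }

The buffer size only bounds how much moves per read(); any overall
cap would have to come from the size bookkeeping around it or from
the off_t/filesystem layer underneath.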
The Content-Length is passed through str_to_wgint(). This is
actually a macro defined in wget.h, and can expand to either
strtoll() or strtoimax(). (strtoimax() is in the POSIX 1003.1
standard, apparently.) So that's covered. wget has an explicit
long long limit, but that's about it. Considering that the
maximum value of a long long is approximately 9.22 * 10^18
(about 8 EiB), it's probably not a big worry.
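The wide-integer plumbing being described would look roughly like
this -- a hypothetical sketch, not the actual wget.h; the
HAVE_STRTOLL feature test and the surrounding details are
placeholders:

  #include <inttypes.h>   /* strtoimax(), intmax_t */
  #include <stdio.h>
  #include <stdlib.h>     /* strtoll() */

  typedef long long wgint;  /* wide type for byte counts, >= 64 bits */

  #ifdef HAVE_STRTOLL
  #  define str_to_wgint(s, end, base) ((wgint) strtoll(s, end, base))
  #else
  #  define str_to_wgint(s, end, base) ((wgint) strtoimax(s, end, base))
  #endif

  int main(void)
  {
      /* A Content-Length well past both 2 GB and 4 GB parses intact. */
      wgint len = str_to_wgint("3758096384", NULL, 10);
      printf("parsed length: %lld bytes\n", (long long) len);
      return 0;
  }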
Dare I unpack Firefox and do the same analysis? :-) It'd be
a little big...
--
#191, ewill3@xxxxxxxxxxxxx
Windows Vista. Because it's time to refresh your hardware. Trust us.