TriNetre - Archive for October 23, 2003
(no longer updated)
There has been a lot of discussion about how download accelerators (DA) work and whether they actually work as "advertised". They do, or at least they should, theoretically.
The basic way in which a DA works is best explained with a real-life example. Suppose you have a narrow road from C to S.
---------------------------
C          Road           S
---------------------------
You need to get some huge stones from S to C. It is easy to see that the job is done faster by an army of (say) 10 men working simultaneously than by a lone man working on his own. Though this is a very simplified way to explain how a DA works, there are similarities.
Let us look a bit deeper at how a DA works, assuming an HTTP download. Let us simplify it further (for now) by assuming that the file comes from one single server.
When you need to get a file (stones) from server (S) to client (C), you start an HTTP connection using TCP. The problem (?) with TCP is that it is a very friendly, conservative protocol. It makes no assumptions about what kind of connection exists between C and S. So, in the beginning, it starts by transferring a small amount of data from S to C. When that works fine (without any packet drops), TCP increases the amount of data transferred the next time, and this goes on until the whole file is fetched. A lonely man at work.
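The ramp-up described above can be illustrated with a toy model. This is only a sketch under loud assumptions: the function name, the byte counts and the fixed additive step are all made up for illustration, and real TCP grows its window differently (during slow start it roughly doubles each round trip). The point is only that a single connection needs many rounds before it is carrying much data.

```python
def rounds_to_fetch(file_size, first_chunk, step):
    """Count fetch rounds for one connection in a toy ramp-up model.

    Assumption (not real TCP): every successful round carries `step`
    more bytes than the round before it, starting at `first_chunk`.
    """
    fetched = 0
    chunk = first_chunk
    rounds = 0
    while fetched < file_size:
        fetched += chunk   # this round's data arrives without loss
        chunk += step      # be a little bolder next time
        rounds += 1
    return rounds


# A lone connection fetching 1000 bytes, starting at 10 bytes per round.
print(rounds_to_fetch(1000, first_chunk=10, step=10))
```

With these made-up numbers the lone worker carries 10, 20, 30, ... bytes per round, so most of the early rounds move very little of the file.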
What a DA does is simple. First it finds out the total size of the file to be fetched. Next it calls up its army of 10 connections (men). It then divides the file into 10 parts and tells each of them which part of the file to fetch. Then each of these connections goes about its work in parallel, fetching its part of the file. In HTTP, this is done using the "Range" header. Each of these connections works like a single connection. Collectively, however, they are able to fetch more data at any point in time. If the lonely worker was able to get 10 bytes of data in the first fetch, the army would be able to get 10*10 bytes of data. Thus, collectively, the army is able to fetch the whole file in less time.
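The splitting step can be sketched as follows. This is a minimal illustration, not how any particular DA is implemented: the function name `split_ranges` is hypothetical, and it assumes the total size is already known (for example from a `Content-Length` response header, if the server sends one). The `bytes=start-end` form with an inclusive end is the standard HTTP Range syntax.

```python
def split_ranges(total_size, parts):
    """Return HTTP Range header values that together cover the whole file.

    Byte positions are zero-based and the end of each range is inclusive,
    so a 1000-byte file split 10 ways starts with "bytes=0-99".
    """
    base, extra = divmod(total_size, parts)
    ranges = []
    start = 0
    for i in range(parts):
        # Spread any remainder over the first `extra` parts.
        length = base + (1 if i < extra else 0)
        if length == 0:
            continue  # more workers than bytes: nothing left to hand out
        end = start + length - 1
        ranges.append(f"bytes={start}-{end}")
        start = end + 1
    return ranges


for value in split_ranges(1000, 10):
    print("Range:", value)
```

Each connection would send its own value in a `Range` request header and write the bytes it receives at the matching offset in the output file.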
More evolved DAs have more tricks up their sleeves. Sometimes, it is the server that is the bottleneck. Think about a Slashdotted server. It might be better to leave alone (or sparingly use) an overtaxed server. Similarly, if 5 servers have the same file, it might be better to distribute the work across these 5 servers, as long as they are equally responsive. So a DA finds mirrors of the file, and each parallel connection tries to fetch its piece from a different server. Smart, aren't they?
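One simple way to spread pieces over mirrors is round-robin, sketched below. This assumes all mirrors are equally responsive, as the paragraph above requires; the function name and the mirror URLs are hypothetical, and a real DA may well use a smarter policy (for example, preferring whichever mirror is currently fastest).

```python
from itertools import cycle


def assign_pieces(ranges, mirrors):
    """Pair each byte range with a mirror, cycling through the mirror list."""
    # zip stops when the ranges run out, so cycle() never loops forever.
    return list(zip(ranges, cycle(mirrors)))


# Hypothetical mirrors and ranges, for illustration only.
mirrors = ["http://mirror1.example/file.iso", "http://mirror2.example/file.iso"]
ranges = ["bytes=0-99", "bytes=100-199", "bytes=200-299"]

for byte_range, mirror in assign_pieces(ranges, mirrors):
    print(mirror, byte_range)
```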
If it is such an open and shut case, why so much controversy? Well, because theory and practice are not always the best of friends. If your machine is low on memory or CPU power, it might actually take more time to get your army out on foot and doing its work than to let a lonely, hard-working connection do the job. Furthermore, whatever the number of connections you use, there is a limit to the amount of data that can be fetched at any time, set by the bandwidth of the connection. You know, the road is only that wide. If you send more soldiers, they will just knock against each other and the stones will fall from their hands. Another thing to take into consideration (if you are a protocol buff) is the overhead involved when you use more connections. Think about the 3-way handshake, the 4-packet connection close and the ACK packets flowing across the network for each connection.
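The two limits above come down to simple arithmetic, sketched here with made-up numbers. Both function names are hypothetical, the rates are in arbitrary units, and the model ignores ACK traffic and CPU or memory cost entirely; it only shows that the total is capped by the link and that handshake and close packets grow with the number of connections.

```python
def aggregate_rate(connections, per_connection_rate, link_bandwidth):
    """Total rate of all connections, capped by the width of the road."""
    return min(connections * per_connection_rate, link_bandwidth)


def setup_teardown_packets(connections):
    """Packets spent opening and closing connections.

    Uses the figures from the text: a 3-way handshake to open and a
    4-packet exchange to close, per connection.
    """
    return connections * (3 + 4)


# Illustrative numbers: each connection could do 10 units, the link carries 50.
for n in (1, 5, 10, 20):
    print(n, aggregate_rate(n, 10, 50), setup_teardown_packets(n))
```

In this toy setup, going from 5 to 20 connections adds no throughput at all while quadrupling the setup and teardown packets.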
If you are interested in open source DAs, check out Download Plus! (Windows) and Prozilla (Linux).
A nice interview with Illiad (aka J.D. Frazer, the guy who draws the UserFriendly comic strip) at LinuxWorld Magazine.
Illiad: You will find a copy of Photoshop. However, you will also find a copy of GIMP. I have a dual-boot system. I'm learning SuSE Linux as well as XP. And the only reason I don't use GIMP entirely - I've already spoken to the GIMP developers when I was in Germany, and they're really cool - but what I need is for GIMP to be completely pre-press ready, it needs all of the bits and pieces that Photoshop does really well. And when GIMP does that, I can punt Windows and I can punt Photoshop forever and I'll be really happy with that. Up until that point, I use GIMP from time to time. I don't necessarily use it in day-to-day production, but I do keep up with it, to see where those guys are headed with it, 'cause I'm pretty pleased with it.
..
LWM: I've been waiting for you to address the whole "GNU Linux" controversy.
Illiad: That is going to happen. I've sort of been reserving that for a much longer story arc because I've met Richard Stallman and he's really bright, and there's no question in my mind that the open source movement, or free software movement, needs someone with completely inflexible ethics and morals and philosophy. But oh my God, some of the things Richard does just blow me away. I've watched people walk up to him with glowing admiration in their eyes, and five minutes later they walk away in near tears because...
LWM: Because he's told them they're scum, they're working for money.
Illiad: (laughing) Yeah, right.
..
LWM: A frequent butt of humor in your strip is the clueless tech support call. Isn't a lot of it an issue with software that's simply poorly written or supported?
Illiad: I agree. I've actually done a few strips where it was clearly Greg who was being the jackass, not the caller. I do see that side of it. What's unfortunate is that I don't think a lot of geeks see that, that there's a significant percentage of the geek audience that believes that they're [the callers] the only ones who could ever possibly be wrong in a help desk situation. I suppose the cartoon strip reflects that.
