Here is what is coming in the next few weeks:
I’ll try to describe my plan in as much detail as possible. Please keep in mind that some details might change and there are still a few design decisions to make, but the core of the plan will not change.
The following is the pseudo-code algorithm for the Firefox crcsync extension:
intercept outgoing http request
check the cache to see if there is a cached version of the url
if there is a cached version
    calculate the CRC of the webpage
    add the crcsync protocol headers
else
    do nothing
intercept the http response and convert the response body to text/html
if it is a crcsync response (identified by the crcsync protocol response headers)
    extract the response body
    decompress the response body
    force removal of no-cache headers to enforce caching
else
    force removal of no-cache headers to enforce caching
The first part is easy, and all the header-manipulating extensions (Tamperdata, liveHTTPHeaders, …) do this. All I need to do is register an “http-on-modify-request” observer:
———————
var observerService = Components.classes["@mozilla.org/observer-service;1"].getService(Components.interfaces.nsIObserverService);
observerService.addObserver(obj, "http-on-modify-request", false);
———————
The process is also described in https://developer.mozilla.org/en/Setting_HTTP_request_headers.
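As a rough sketch of what the observer object itself might look like: the object passed to addObserver only needs an observe method. The names makeRequestObserver, hasCachedCopy and addCrcsyncHeaders are placeholders of mine, not part of any Mozilla API, and the QueryInterface step is only noted in a comment so that the snippet also runs outside Firefox.

```javascript
// Hypothetical factory for the request observer.
// hasCachedCopy(url) -> boolean and addCrcsyncHeaders(channel) are
// assumed helpers that the extension would have to provide.
function makeRequestObserver(hasCachedCopy, addCrcsyncHeaders) {
  return {
    observe: function (subject, topic, data) {
      // The observer service may deliver other topics to the same object.
      if (topic !== "http-on-modify-request") {
        return;
      }
      // In Firefox the subject would first be narrowed with
      // subject.QueryInterface(Components.interfaces.nsIHttpChannel).
      var channel = subject;
      if (hasCachedCopy(channel.URI.spec)) {
        addCrcsyncHeaders(channel);
      }
      // No cached version: do nothing, the request goes out unchanged.
    }
  };
}
```

An object built this way would take the place of obj in the addObserver call above.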
The crcsync protocol specification defines a new Content-Encoding: crcsync. So I’ll modify the Accept-Encoding header to include crcsync and add these new headers:
A-IM: crcsync
If-Block: <base 64 encoded hashes>
File-Size: <size of cached file>
And send the request.
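To make the header step concrete, here is a minimal sketch that computes the values to send. The function name and its parameters are my own invention; the hashes are assumed to arrive already base64-encoded from the crcsync library.

```javascript
// Hypothetical helper: returns the request headers described above.
// acceptEncoding  - current Accept-Encoding value (may be empty or null)
// encodedHashes   - base64-encoded block hashes (assumed to be precomputed)
// cachedFileSize  - size of the cached file in bytes
function buildCrcsyncRequestHeaders(acceptEncoding, encodedHashes, cachedFileSize) {
  var encodings = (acceptEncoding || "")
    .split(",")
    .map(function (token) { return token.trim(); })
    .filter(function (token) { return token.length > 0; });
  // Advertise crcsync once, keeping whatever the browser already accepts.
  if (encodings.indexOf("crcsync") === -1) {
    encodings.push("crcsync");
  }
  return {
    "Accept-Encoding": encodings.join(", "),
    "A-IM": "crcsync",
    "If-Block": encodedHashes,
    "File-Size": String(cachedFileSize)
  };
}
```

Inside the observer, each entry would then be applied with the channel's setRequestHeader(name, value, false).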
For the second part of the algorithm, I will intercept the responses by registering an “http-on-examine-response” observer, the same way as for “http-on-modify-request”.
The tricky part is modifying the response body. I contacted a Necko developer, who told me “I don’t recall an API for doing this.” but suggested a workaround using a StreamConverter, which is used to convert between different content encodings.
If the reply contains the header Content-Encoding: crcsync, the response body can be passed through the StreamConverter, which will decompress the data.
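The check for a crcsync reply could be sketched like this. The function name is hypothetical, and it assumes the Content-Encoding value has already been read from the channel (as far as I know, getResponseHeader throws when the header is missing, so the caller would need a try/catch and pass null in that case).

```javascript
// Hypothetical helper: true when the Content-Encoding value lists crcsync.
// Content-Encoding may carry several comma-separated codings, so each
// token is compared on its own, ignoring case and surrounding spaces.
function isCrcsyncResponse(contentEncoding) {
  if (!contentEncoding) {
    return false;
  }
  return contentEncoding.split(",").some(function (token) {
    return token.trim().toLowerCase() === "crcsync";
  });
}
```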
How to register a StreamConverter is explained in netwerk/streamconv/public/nsIStreamConverter.idl:
 * Registering a stream converter:
 * Stream converter registration is a two step process. First of all the stream
 * converter implementation must register itself with the component manager using
 * a contractid in the format below. Second, the stream converter must add the contractid
 * to the registry.
 *
 * Stream converter contractid format (the stream converter root key is defined in this
 * file):
 *
 * @mozilla.org/streamconv;1?from=FROM_MIME_TYPE&to=TO_MIME_TYPE
For crcsync, this would translate to @mozilla.org/streamconv;1?from=crcsync&to=uncompressed.
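Spelled out as code, the contract ID is just the format from the IDL comment with the two types filled in. The helper name is mine; only the format string comes from the IDL file.

```javascript
// Hypothetical helper: builds a stream converter contract ID following the
// format documented in nsIStreamConverter.idl.
function streamConverterContractId(fromType, toType) {
  return "@mozilla.org/streamconv;1?from=" + fromType + "&to=" + toType;
}

// The ID the crcsync converter would register under.
var CRCSYNC_CONVERTER_ID = streamConverterContractId("crcsync", "uncompressed");
```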
Since the response carries various “no-cache” headers and an expiry date set in the past, Firefox will not cache it. The best way to force caching is to strip the offending response headers (the code can be taken from the BetterCache extension: http://netticat.ath.cx/BetterCache/BetterCache.htm).
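A minimal sketch of the stripping step, working on a plain object of header names and values. The list of headers is an assumption on my part and is deliberately blunt (it drops Cache-Control, Pragma and Expires whatever their values are); the real list should be checked against what BetterCache does.

```javascript
// Assumed list of response headers that can prevent caching.
var CACHE_BLOCKING_HEADERS = ["cache-control", "pragma", "expires"];

// Hypothetical helper: returns a copy of the headers without the
// cache-blocking ones. Header names are compared case-insensitively.
function stripCacheBlockingHeaders(headers) {
  var kept = {};
  Object.keys(headers).forEach(function (name) {
    if (CACHE_BLOCKING_HEADERS.indexOf(name.toLowerCase()) === -1) {
      kept[name] = headers[name];
    }
  });
  return kept;
}
```

On a real channel the same effect would presumably be reached by clearing each of these headers with setResponseHeader inside the “http-on-examine-response” observer.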
It is my opinion that the cache mechanism should be integrated with the regular Firefox cache, BUT on a separate CacheSession in the offline cache.
Right now I can think of one way to do it. By keeping the crcsync cache in the offline cache, we can ensure that ONLY the crcsync responses are cached in this special cache. That way, the extension will only send the request headers if it finds an existing cached copy of the remote resource. If not, then the remote server is not crcsync-aware and there is no need to modify headers.
To use the cache for read / write access, I need to create a CacheSession with the CacheService (@mozilla.org/network/cache-service;1). This CacheSession would run in parallel with the regular cache for non-crcsync enabled servers.
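Creating such a session might look roughly like the sketch below. It assumes the createSession(clientID, storagePolicy, streamBased) method of nsICacheService and the STORE_OFFLINE and STREAM_BASED constants of nsICache; the function name and the "crcsync" client ID are my own choices, not something fixed by the protocol.

```javascript
// Cc and Ci stand for Components.classes and Components.interfaces. They are
// passed in as parameters so the function can be tried with stand-ins
// outside Firefox.
function openCrcsyncCacheSession(Cc, Ci) {
  var cacheService = Cc["@mozilla.org/network/cache-service;1"]
    .getService(Ci.nsICacheService);
  // "crcsync" is an assumed client ID that keeps these entries apart
  // from the regular browser cache.
  return cacheService.createSession(
    "crcsync",
    Ci.nsICache.STORE_OFFLINE,
    Ci.nsICache.STREAM_BASED
  );
}
```

In the extension it would be called as openCrcsyncCacheSession(Components.classes, Components.interfaces).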
On to performance. Most of the CPU-intensive tasks will be done in the crcsync library, which will be compiled as a C++ XPCOM component.
Now there is an important design decision to be made. Should the rest of the extension be written in C++ or JavaScript?
Although I haven’t written a line of JavaScript and I do have experience with C++, JavaScript seems really easy and manageable.
I’ll probably do a bit of both.
My original idea to use other extensions as a base doesn’t seem as interesting as I thought it was. Most of these extensions have complex XUL interfaces, and I don’t think I need a GUI (only for on and off, but that can be done with the “disable” button in the addons dialog).
But of course, looking at their source code for ideas and for “robbing” snippets of code is still an excellent idea.
I expect to finish coding in July. In August I will do some serious testing and eventually improve the performance and efficiency of the protocol and the extension. During this phase I will also distribute the beta version of the extension to the http-crcsync developers.
I’m committing myself to maintaining it after the end of GSoC and upgrading it whenever a new version of Firefox or the crcsync protocol emerges.