Snippets code from my daily experience

December 22, 2008

Downloading protected resources using nsIChannel and friends

Filed under: babelzilla,extension,komodo,macro,nsIChannel,nsIStreamListener — dafi @ 2:35 pm

A couple of days ago I needed to automate file downloading from a service, a very trivial task in every programming language (or using wget).

The little complication was the web-based authentication mechanism (userid/password) needed to access the files.
Determining which files to download and how to use them (unzipping and picking files) required some specific business logic, nothing really complicated but very annoying.

After a while I realized this job could be done using JavaScript and XPCOM, so here I share the solution based on nsIChannel.

What it does

  • the service uses userid and password to log in
  • log in to the service using the HTTP POST method, simulating an HTML <form/> submission
  • store the cookies sent by the server. They contain the login credentials; cookie usage is a prerequisite in our scenario
  • reuse the login data to download protected resources

Example usage

Suppose you want to automate the download of extensions stored in AMO’s sandbox: you must log in first, so this use case is perfect for us.
Some may consider this approach ugly, and web services or the Remora API are better, but here I only want to demonstrate how to use nsIChannel.

Let’s start…

You can write the JavaScript code shown below to download my extension RichFeedButton (dropped in the sandbox a year ago)

var amoUsername = "dafi@localhost";
var amoPassword = "my_secret_code";
downloadProtectedResource(
   "https://addons.mozilla.org/it/firefox/users/login",

   "data[Login][email]=" + amoUsername + "&data[Login][password]=" + amoPassword,

   "https://addons.mozilla.org/en-US/firefox/downloads/file/33926/richfeedbutton-0.0.21-fx.xpi",

   "/tmp/richfeedbutton-0.0.21-fx.xpi");

where our downloadProtectedResource function signature is shown below

function downloadProtectedResource(loginUrl, postData, resourceUrl, destPathName) { ... }

Nothing special: we simply need to know the HTML input names used for userid and password (i.e. data[Login][email] and data[Login][password]) and pass them in the postData argument.
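One detail worth handling: if the userid or password contains characters such as & or =, concatenating them as-is corrupts the form body. Below is a minimal sketch of a helper that URL-encodes every name and value. The name buildPostData is hypothetical, it is not part of the original macro.

```javascript
// Hypothetical helper (not in the original macro): builds an
// application/x-www-form-urlencoded body from name/value pairs.
function buildPostData(fields) {
   var pairs = [];
   for (var name in fields) {
      if (Object.prototype.hasOwnProperty.call(fields, name)) {
         // encodeURIComponent escapes characters such as & = [ ] and @
         pairs.push(encodeURIComponent(name) + "=" + encodeURIComponent(fields[name]));
      }
   }
   return pairs.join("&");
}

// Example with a password containing "&", which would otherwise
// be read by the server as the start of a new field
var postData = buildPostData({
   "data[Login][email]": "dafi@localhost",
   "data[Login][password]": "my_secret&code"
});
```

The resulting string can be passed as the postData argument in place of the hand-built one.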

The downloadProtectedResource function interacts with nsIChannel (nsIHttpChannel) and other XPCOM objects

function downloadProtectedResource(loginUrl, postData, resourceUrl, destPathName) {
   var httpChannel = makeHttpChannel(loginUrl); // create an object nsIHttpChannel
   var stream = makeStringStream(postData); // create an object nsIStringInputStream
   setChannelPostData(httpChannel, stream); // fill data using nsIUploadChannel

   // downloader saves data on disk
   var downloader = new Downloader(resourceUrl, destPathName);
   // logs in, then passes the cookies to the downloader object
   var cookieListener = new CookieRetrieverListener(downloader);

   // start authentication and download
   httpChannel.asyncOpen(cookieListener, null);
}
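The three helpers called at the top of the function are not shown in this post. Here is a rough sketch of how they could look. The function names come from the code above, but the bodies are my assumption of a reasonable implementation and may differ from the complete code.

```javascript
// Sketch only: the bodies below are assumptions, not the original helpers.

function makeHttpChannel(url) {
   var ioService = Components.classes["@mozilla.org/network/io-service;1"]
                     .getService(Components.interfaces.nsIIOService);
   // newChannel(spec, originCharset, baseURI)
   var channel = ioService.newChannel(url, null, null);
   return channel.QueryInterface(Components.interfaces.nsIHttpChannel);
}

function makeStringStream(data) {
   var stream = Components.classes["@mozilla.org/io/string-input-stream;1"]
                  .createInstance(Components.interfaces.nsIStringInputStream);
   stream.setData(data, data.length);
   return stream;
}

function setChannelPostData(httpChannel, stream) {
   var uploadChannel = httpChannel.QueryInterface(Components.interfaces.nsIUploadChannel);
   // -1 lets the channel work out the content length from the stream
   uploadChannel.setUploadStream(stream, "application/x-www-form-urlencoded", -1);
   // setUploadStream switches the HTTP method to PUT, so set POST afterwards
   httpChannel.requestMethod = "POST";
}
```

The order inside setChannelPostData matters: requestMethod must be assigned after setUploadStream, otherwise the form is sent as a PUT.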

The Downloader and CookieRetrieverListener objects implement the nsIStreamListener interface.

After obtaining the cookies, the cookieListener aborts the operation because we don’t need the rest of the server output, then it calls the downloader.

function CookieRetrieverListener(downloader) {
   this.downloader = downloader;
   this.cookies = "";
}

CookieRetrieverListener.prototype = {
   onStartRequest: function(request, ctx) {
      var channel = request.QueryInterface(Components.interfaces.nsIHttpChannel);
      this.cookies = channel.getRequestHeader("Cookie");

      // no more data needed, abort the request
      throw Components.results.NS_ERROR_ABORT;
   },

   onDataAvailable : function(request, context, inputStream, offset, count) {
   },

   onStopRequest: function(request, ctx, status) {
      this.downloader.cookies = this.cookies;
      this.downloader.start();
   }
}
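The Downloader object is not listed in this post. To give an idea of what sits on the other side of the listener, here is one possible sketch. It only assumes what the code above needs (a cookies property and a start() method), everything else is my guess and may differ from the complete code.

```javascript
// Sketch only: a possible Downloader, not the original implementation.
function Downloader(resourceUrl, destPathName) {
   this.resourceUrl = resourceUrl;
   this.destPathName = destPathName;
   this.cookies = "";         // filled in by CookieRetrieverListener
   this.outputStream = null;  // opened when the response starts
}

Downloader.prototype = {
   start: function() {
      var ioService = Components.classes["@mozilla.org/network/io-service;1"]
                        .getService(Components.interfaces.nsIIOService);
      var channel = ioService.newChannel(this.resourceUrl, null, null)
                        .QueryInterface(Components.interfaces.nsIHttpChannel);
      // send the login cookies with the download request (false = replace, do not merge)
      channel.setRequestHeader("Cookie", this.cookies, false);
      channel.asyncOpen(this, null);
   },

   onStartRequest: function(request, ctx) {
      var file = Components.classes["@mozilla.org/file/local;1"]
                   .createInstance(Components.interfaces.nsILocalFile);
      file.initWithPath(this.destPathName);
      this.outputStream = Components.classes["@mozilla.org/network/file-output-stream;1"]
                   .createInstance(Components.interfaces.nsIFileOutputStream);
      // 0x02 = write only, 0x08 = create, 0x20 = truncate; 420 is octal 0644
      this.outputStream.init(file, 0x02 | 0x08 | 0x20, 420, 0);
   },

   onDataAvailable: function(request, ctx, inputStream, offset, count) {
      // read the chunk as raw bytes and append it to the file
      var binaryStream = Components.classes["@mozilla.org/binaryinputstream;1"]
                   .createInstance(Components.interfaces.nsIBinaryInputStream);
      binaryStream.setInputStream(inputStream);
      var bytes = binaryStream.readBytes(count);
      this.outputStream.write(bytes, bytes.length);
   },

   onStopRequest: function(request, ctx, status) {
      if (this.outputStream) {
         this.outputStream.close();
         this.outputStream = null;
      }
   }
};
```

In this sketch the file is opened only when the response starts, so a failed connection does not leave an empty file behind.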

Another use of this code is downloading localizations from BabelZilla, as shown below.
BabelZilla requires many parameters in the request:


var bzUsername = "dafi_duck";
var bzPassword = "my_secret_code";
var bzItemId = "88";
var bzExtId = "4432";
downloadProtectedResource("http://www.babelzilla.org/index.php",
                 "op2=login&lang=english&message=0"
                        + "&option=ipblogin&task=login&0b14737c5ade1f7697a8f81b33b0bacf=1"
                        + "&option=com_frontpage&Itemid=1"
                        + "&username=" + bzUsername
                        + "&passwd=" + bzPassword,
                 "http://www.babelzilla.org/index.php?option=com_wts&type=downloadtar"
                        + "&Itemid=" + bzItemId
                        + "&extension=" + bzExtId,
                 "/tmp/vsw.tar.gz");

nsIChannel.asyncOpen

Accessing the cookies received from the server requires an nsIStreamListener, which is available only with asynchronous open calls.

The download must start only when the cookies have surely been retrieved; this is achieved using nsIRequestObserver.onStopRequest. Any better idea is much appreciated.

Complete code

The complete code contains a few helper functions (reading a binary stream, saving a file) and is available on SVN; it’s ready to be executed as a Komodo macro simply by setting userid and password.
