Blasted DST and shoddy software

I've manually corrected the timezone that this software thinks I'm in.

Taking out the trash

I think I have sensible session pinning/unpinning now. After some ad-hoc page reloading, timing out of connections, and waiting, the session count is down to one event pending (the garbage collector), one file open (the listen socket), and one extra (the loopback application server), so it appears that there is no more session leak. I do appear to have a memory leak somewhere, though: the process's resident set size increases when new clients connect and does not decrease when they disconnect. I'll have to go in with a profiler at some point and see what's going on there.

After some rework on the module boundaries, I have made the Halley::Client module very light-weight, just an adapter really, and the Halley::httpd a little bit heavier.  This will make it easier to make other protocols and interfaces pluggable by the application developer.  Some possibilities that come to mind include a console via UNIX domain socket and an XML-RPC or SOAP server that feeds events to the application server via the same Halley::Client interface as web browsers.  Since these each have a different connection/session model than the browser client, putting the management of the Client sessions into the protocol-specific module makes sense.

The application framework now sends halley.connect and halley.disconnect events to all registered applications whenever a client session connects or disconnects. For the browser client, "connected" is based on a timeout: a client counts as connected for as long as its session has not been garbage-collected. The details here may change later but the idea seems workable. The client's POE session ID is sent along with these events, and that's how the application should keep track of connected clients. Application session management should be layered on top of those client sessions.
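
To make the bookkeeping concrete, here is a minimal sketch of an application keeping its own set of connected clients from those two events. It is written in JavaScript purely for illustration (the real application server is Perl/POE), and the ConnectedClients class and its handle() method are hypothetical names, not part of Halley.

```javascript
// Hypothetical application-side tracker; not Halley's actual API.
// It only assumes what the post states: each halley.connect and
// halley.disconnect event carries the client's session ID.
class ConnectedClients {
  constructor() {
    this.ids = new Set(); // session IDs currently considered connected
  }

  // Feed every framework event through here; unrelated events are ignored.
  handle(eventName, sessionId) {
    if (eventName === "halley.connect") {
      this.ids.add(sessionId);
    } else if (eventName === "halley.disconnect") {
      this.ids.delete(sessionId);
    }
  }

  // Snapshot of connected session IDs, in connection order.
  list() {
    return [...this.ids];
  }
}
```

Using a set keyed by session ID means a repeated connect event for the same client is harmless, and any per-user application state can be hung off the same key.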

So far, so good.

Next steps: Make the sample application keep a list of clients that are connected, and broadcast changes out to all connected clients.  The client should update its list of connected users every time it gets an update event.  After that, it should be a short step to a public chat room.

Hello? Are you there?

When is a client connected?

The client-side code can't guarantee that a "disconnect" event or request will reach the server when the page is closed. The client-side code can't control which socket each request uses. The client-side code can't even close sockets when they're no longer in use. In fact, the client-side code can't do much of anything but send requests and hope they work out.

So, really, the only thing that the server can know is, at the instant when a request is received from a client, that client is probably connected. (Probably. Who knows? It might have crashed right after sending the request.) Even the browser closing a socket doesn't mean much: it may simply have decided to do so, and in any case that's not under the client-side code's control.

This leads to the conclusion that the Halley server-side must explicitly pin and then garbage collect Halley::Client session objects and, maybe, Halley::httpd::filter's session objects too. The latter should clean themselves up as the OS decides that the connections are gone but I don't know if it's safe to assume this.

Here's the plan. Whenever a Halley::Client (HClient) for a given browser tag can't be found, a new HClient will be created, its refcount will be incremented, and it will be marked in the master HClient list as being in use. Whenever an HClient for a given browser tag is found, it will be marked in the master HClient list as being in use. Then, periodically, the master HClient list will be scanned. Any entries not marked in use will be reaped by decrementing the refcount and removing the entry. Any entries marked in use will be marked as not in use. Pretty straightforward mark-and-sweep. The POE garbage collection should then discard the sessions as soon as it's safe to do so.
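
As a rough illustration of that plan, here is a small sketch of the mark-and-sweep list. It is JavaScript rather than the server's Perl, and ClientRegistry, touch() and sweep() are made-up names; holding the client in the map stands in for the POE refcount increment, and deleting the entry stands in for the decrement.

```javascript
// Hypothetical registry sketching the plan above; not Halley's actual code.
class ClientRegistry {
  constructor(createClient) {
    this.createClient = createClient; // assumed factory: tag -> client object
    this.entries = new Map();         // browser tag -> { client, inUse }
  }

  // Called for every incoming request: find or create, then mark in use.
  touch(tag) {
    let entry = this.entries.get(tag);
    if (!entry) {
      // Keeping the client in the map plays the role of the refcount pin.
      entry = { client: this.createClient(tag), inUse: false };
      this.entries.set(tag, entry);
    }
    entry.inUse = true;
    return entry.client;
  }

  // Called periodically: reap unmarked entries, clear the mark on the rest.
  sweep() {
    const reaped = [];
    for (const [tag, entry] of this.entries) {
      if (entry.inUse) {
        entry.inUse = false;
      } else {
        this.entries.delete(tag); // dropping the reference unpins the client
        reaped.push(tag);
      }
    }
    return reaped; // tags that would trigger a disconnect notification
  }
}
```

One consequence of this scheme is that a silent client is reaped somewhere between one and two sweep intervals after its last request, so the sweep period is effectively the connection timeout.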

Loopback, multiple events, and IE 6 support

The loopback test works. Typing into a field causes a message to go through the client, the server, the application, and back to the server, the client, and a different field.

Event batching works. If the server is suspended, the browser queues up events and sends them as one batch as soon as the server comes back. (Well, that's not quite true: the POST body is generated when the client first decides to send a message, so the first event goes out alone and the second and later events are batched into one round-trip that immediately follows it.) The server's quick succession of replies to the batch of events from the client is sent in one batch, too.

And, after a bit of tweaking to work around the fact that the XMLHTTP object is an ActiveX object instead of a JavaScript object, meaning it won't accept new member fields, the client-side part of Halley works in Internet Explorer 6. Other versions not yet tried, but I expect (hope?) it will work in IE 7, which pretty much rounds out the market.

It does work fine in Safari on the iPod touch. And of course it works in Firefox 2.

Next steps: I think it's about time to see how an application can be passively aware of what clients are connected at any given moment. Pretty soon that will mean I can no longer avoid the question of garbage collecting Halley::Client session objects and the Session objects spun off by Halley::httpd.

Batching events and the wire representation

After a bit of a hiatus, I'm back to working on Halley. Today I am finalizing the wire representation of events between the Halley server side (Perl) and the Halley client side (JavaScript). This only affects application developers in that they will see a very nice interface for the two halves of their application to communicate with each other, so while the internals won't affect users, it would be good to get them solidified so I won't be as tempted to change them again.

On both sides, when the application sends a Halley event to the other side, the event will be queued up and then event queue processing will be queued. Let's be more clear. A Halley event is queued by the application. The Halley framework places that event in the send queue. Then it schedules the send queue to be processed -- with setTimeout for the browser side and by queueing a POE-event on the server side -- as soon as reasonably possible. Why the double queueing? If the application is firing off events rapidly, this will allow the communication to happen in larger batches, reducing the number of round-trips, and (I am guessing, but it sounds reasonable) reducing the overall latency.
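
The double queueing is easier to see in code. Below is a minimal browser-side sketch, not the actual Halley source: createSendQueue, transmit and schedule are hypothetical names, and the scheduler is passed in only so the behaviour can be exercised without real timers (it defaults to setTimeout, as described above).

```javascript
// Hypothetical helper illustrating the two-stage queue; not Halley's code.
// `transmit` receives one array of queued events per flush.
// `schedule` defers a callback; by default it uses setTimeout(fn, 0).
function createSendQueue(transmit, schedule = (fn) => setTimeout(fn, 0)) {
  let pending = [];           // stage one: events waiting to go out
  let flushScheduled = false; // stage two: is queue processing already queued?

  function flush() {
    flushScheduled = false;
    const batch = pending;
    pending = [];
    if (batch.length > 0) {
      transmit(batch); // one round-trip for everything queued so far
    }
  }

  return {
    sendEvent(eventType, ...args) {
      pending.push({ event: eventType, args: args });
      if (!flushScheduled) {
        flushScheduled = true;
        schedule(flush); // only one flush is ever outstanding
      }
    },
  };
}
```

Because the flush is deferred rather than run inline, every event the application fires before control returns to the event loop ends up in the same batch.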

The Perl application sends by executing:
POE::Kernel->post($client, 'send', $event_type, arg1, arg2, ...)
where arg1, arg2, ... are any number of arguments of any type. (Of course, the arguments must be encodable as JSON, but that should be obvious and not too strong a restriction for most cases. Anything that's not a subref should be okay.) This is then queued up as a Perl hash of the form { event => $event_type, args => [ arg1, arg2, ... ] }.

A future enhancement might remove the 'send' POE-event type and have the POE-client pass on any POE-events it doesn't have reserved (received on a _default event handler) but this is not critical or difficult.

When it comes time to send the pending events, assuming there was a second event with no arguments queued after the above, the JSON array transmitted to the browser-side code will be:
[ { "event": "event_type", "args": [ "arg1", "arg2", "..." ] },
{ "event": "another_event_type", "args": [ ] } ]

This is simply the string that results from JSON-encoding a Perl array of the above internal form.

The JavaScript application sends by executing:
halley.sendEvent(event_type, arg1, arg2, ...);
where again, arg1, arg2, ... are any number of arguments of any type. Anything sensible (i.e. not functions) should be okay. This is then queued up as a JavaScript object of the form { event: event_type, args: [ arg1, arg2, ... ] }.

The wire representation of events from the browser to the application server is the same as for the other direction, and is simply the string that results from JSON-encoding a JavaScript array of the above internal form.
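
For illustration, here is a small sketch of encoding and decoding one batch in that format. The encodeBatch and decodeBatch helpers are hypothetical, it assumes a JSON.stringify/JSON.parse implementation is available in the browser, and treating the event type as a string is my assumption based on the example above.

```javascript
// Hypothetical helpers for the wire format described above; not Halley's code.

// A batch is just the JSON encoding of an array of { event, args } entries.
function encodeBatch(events) {
  return JSON.stringify(events);
}

// Parse a batch and check that every entry has the expected shape.
function decodeBatch(text) {
  const batch = JSON.parse(text);
  if (!Array.isArray(batch)) {
    throw new TypeError("batch must be a JSON array");
  }
  return batch.map(function (entry) {
    if (entry === null || typeof entry.event !== "string" || !Array.isArray(entry.args)) {
      throw new TypeError("each entry needs an event name and an args array");
    }
    return { event: entry.event, args: entry.args };
  });
}
```

Note that JSON.stringify does not raise an error for a function inside an args array; it silently encodes it as null, which is one more reason to keep functions out of event arguments.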
