Welcome to the adventure

Summer 2006 plans

Tuesday April 25, 2006

What the heck am I doing for the summer?

When graduating, it seems like everyone’s favorite question to ask you is “what’re you doing next?”, and rightly so. I wish I had a simple one sentence answer like “going to Google for the summer, then grad school in the fall” (which was my initial plan). Alas, it is not so. Rather than answer it a million times, I thought it best to blog what it is I’m doing.

Be warned: everything I’ve been planning has a ton of uncertainty, so I guess I’ll update as things progress.

Initially I applied to lots of grad schools for a PhD in Human-Computer Interaction (HCI). I did this because there are some interesting problems out there to solve, and I think I have the background to contribute solutions to these problems. I’m also very interested in Computer Vision (I’m taking a grad course in object recognition during this final semester). I was planning to decide which track to pursue when I got into grad school. I really want to use Computer Vision and apply it to transportation research, and interviewed for a summer position in a research lab doing just that.

After all of the grad school application dust settled, it turned out I had opportunities to do Computer Vision at Purdue and to do either HCI or Vision at UMD. My initial intention was to move away from UMD, but it really has great programs in just the two topics I’m interested in, Vision and HCI, so doing research at UMD would be a sensible option. My advisor is excited about what I’m working on now (pen-based digital annotation research) and wants me to keep doing it as an HCI PhD.

I was going to decline both schools because I still need time to decide if the PhD is worth it to me and fits my career goals. However, I accepted at UMD and deferred admission for a year. So if I do go to grad school, it will probably be at UMD in either HCI or Vision, and I’ll enter in Fall 2007.

So what am I doing in the meantime?

I could work in industry for a year, and have gotten some interest from all of the major companies I’d consider working for, but I feel like that would be more of the same. While in college, I’ve interned twice for Microsoft and once for IBM, and while these are great tech companies, I feel like I have to do something that’s “mine” — start up and work on problems that I’m really interested in solving. I don’t really want to “work for a company” =) at least not right now.

So essentially, what I really want to do is start a company. This desire is what is causing me to reconsider grad school, because it doesn’t directly help me fulfill that desire, at least in the short term. The time is ripe! I have nothing to lose, lots of ambition, and tons of resources and connections. And most importantly, this is the first time in four years I’ll be able to work in areas I’m interested in completely uninhibited by classes and internships. I want to see what I can accomplish with my full attention on something.

The plan is to work where I am now in MD, for at least the summer, while trying to get a company off the ground. I have a ton of good ideas and a few leads on venture capital if I need it, but I’ll try and bootstrap the firm from nothing if I can, so I don’t have to give away equity (after all, Gates was a bootstrap entrepreneur). All of these conditions coming together at once (graduation, bootstrap cash, fresh connections with the major industry companies from internships) make the startup route irresistible.

The exact business plan isn’t in stone, and neither is the product. I definitely want to do a product-based company rather than service-based (an “online web service,” like digg, is defined as a product in this case). What I really want to do is productize what I’ve been working on as research, since it’s fun, useful, and no one has done it yet. I’ve also got some other software brewing on the back burner that could come forward as a product.

I’ll also be visiting family and friends that I haven’t seen in a while, and might work at the university under my advisor full-time for a portion of the summer to finish some of the research I’ve been working on and try to publish a paper for the CHI conference.

So that’s the gist. Should be exciting!

ASP.NET on-the-fly compilation and db4o

Sunday April 23, 2006

With db4o, you can persist objects from Java and .NET transparently to a database. You can then query for objects based on their class type.

If you use db4o when writing ASP.NET applications in Visual Studio 2005, you may run into problems. Visual Studio encourages you to put all of your “non-page related” classes in an App_Code folder. While your website is running, you can modify and replace the class source files in that folder, and they will be recompiled on the fly as they’re needed by webpages. This is a very cool feature. Here’s an overview of this and some new stuff in ASP.NET 2.0. As an alternative, you could precompile your site instead of having your code compiled as it’s needed when the site is running.

Anyway, back to db4o. Each time the classes in App_Code are recompiled into a new assembly, the assembly is renamed to have a hash in its name (the name is mangled to be unique), so that there isn’t a name collision. When storing and querying, db4o relies on the fully qualified name of your class and its assembly string, e.g.:

Namespace.Class App_Code.junkCharacters Version=0.0.0.0 ...

No big deal, right? Except that in ASP.NET, different “junkCharacters” are put in the assembly string every time you modify one of the files in App_Code. So, if you store a “Person” into the database, modify some other file in App_Code, and then try and query for “Person,” db4o will think that the Person class you’re querying with is different than the one you stored in the database, and the query will not return any hits.
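The failure is easy to reproduce with a toy model. The sketch below is in C++ (to match the native code elsewhere on this blog) and is not db4o's API: ToyObjectStore, hitsAfterRecompile, and the names "Models.Person", "App_Code.abc123" and "App_Code.xyz789" are all made up for illustration. It only assumes what is described above, namely that stored objects are looked up by class name plus assembly name.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <vector>

// Toy stand-in for an object database that indexes stored objects by their
// full type identity: "Namespace.Class, AssemblyName". This is NOT db4o's
// API; it only models the lookup-by-name behavior described in the post.
class ToyObjectStore {
public:
    void store(const std::string& typeName, const std::string& assemblyName,
               const std::string& object) {
        objects_[key(typeName, assemblyName)].push_back(object);
    }

    // Returns every object stored under exactly this type + assembly pair.
    std::vector<std::string> query(const std::string& typeName,
                                   const std::string& assemblyName) const {
        auto it = objects_.find(key(typeName, assemblyName));
        return it == objects_.end() ? std::vector<std::string>{} : it->second;
    }

private:
    static std::string key(const std::string& typeName,
                           const std::string& assemblyName) {
        return typeName + ", " + assemblyName;
    }

    std::map<std::string, std::vector<std::string>> objects_;
};

// Stores one Person while the (hypothetical) assembly is named
// "App_Code.abc123", then counts the hits for a query made under
// queryAssembly, the assembly name in effect at query time.
std::size_t hitsAfterRecompile(const std::string& queryAssembly) {
    ToyObjectStore db;
    db.store("Models.Person", "App_Code.abc123", "Alice");
    return db.query("Models.Person", queryAssembly).size();
}
```

Calling hitsAfterRecompile("App_Code.abc123") finds the stored Person, while hitsAfterRecompile("App_Code.xyz789") finds nothing, even though the class itself never changed.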

To get around this, you need to have all of the classes you’re going to use with db4o built into an assembly with a stable name. In the full version of Visual Studio, you can just create another project that contains all of your non-page related code, and then reference it from your ASP.NET project.

Your VS solution might look like this:

- Website
    *.aspx
    *.aspx.cs
    reference to ModelCode

- ModelCode (class library)
    *.cs

More details can be found in this helpful post on the db4o forums.

Google calendar can read emails!

Sunday April 23, 2006

I just found this feature in Google Calendar which is going to save me a lot of time. Not sure if Outlook does something similar.

In Gmail, if you’re reading an email and click on “add event”, a new calendar event window will pop up. I have an email with the following text:

...
- Raffle prizes.
- FREE PIZZA!!!!

Date: Tuesday, April 25th
Time: 5pm - 6pm
Location: 3120 Computer Science Instructional Center (CSIC)

Don't Miss Out!
...

And the new event dialog had everything filled in perfectly - “when,” “where,” and even “what.” Pretty robust! Nice job Google.

It’s pretty obvious how they implemented this one; it’s a standard design pattern. Since the emails don’t always contain the strings “when: ” and “where: ” in them, you can’t parse them programmatically. Instead, specially trained human speed-readers work 24-7 at Google in Mountain View and read your email when you click “add new event.” They then fill in the fields in the event dialog with amazing alacrity. They must still be working out some of the bugs, because if you add too many events in too short a time, the reader guy gets tired and you can see him typing in the calendar event info as the dialog pops up. You have to be quick though.

Gentoo has no place on the desktop

Wednesday April 12, 2006

It’s been a while since I switched all of my machines from the Gentoo Linux distribution to something else; about a year, in fact. Now that I’ve had some time to look back, I can articulate the reasons why I switched. Upon reflection, it’s clear that Gentoo Linux has no business being compared to the leading desktop distributions. It is good for entirely different purposes, like kernel and package development, but its advantages in these areas give it considerable disadvantages when used as a desktop distribution.

Popular reasons given for trying Gentoo:

  1. It’s very customizable
  2. You learn more
  3. It’s bleeding edge

Gentoo is very customizable
Yeah, it is. You can use “USE” flags when installing your software to compile in the features you want. So, for example, you can selectively disable printing support in all of your applications, which avoids installing the printing dependencies, and I guess saves you some memory in your application. I would always wonder why my application had some missing feature and would come to find I had neglected to set a USE flag when compiling the software. Fixing this would require me to: save my work and shut down the application (always annoying), find the right USE flag that matters (is it “mp3” for mp3 support, or “mpeg3”? Do I need any other arcane combination of USE flags to get this working?), edit some files to include the new USE flag, and then recompile the program. The recompilation can be impossibly annoying, especially on a big application like Abiword.

Secondly, I don’t _ever_ remember removing USE flags; instead, I was always adding them. To save myself some headache there really should have been an option to “include all USE flags,” which is essentially how the major binary distributions are compiled. I don’t care about a little extra memory here or there, or a few extra dependencies installed on my system. Life is too short to contemplate whether you use the spell checker or will in the future; it’s too short to decide whether you can handle a 3MB spelling library on your disk because your word processor requires it as a dependency when the USE flag is set. While USE flags are an interesting idea, for average desktop use they’re just added complexity.

You learn more
Yeah, you do learn more, because you’re forced to. To get support for hardware and 3rd party drivers, I ended up recompiling my kernel literally hundreds of times (across many systems). This is kind of useful, because now I know how the kernel is organized. But on the whole, I don’t consider staring at that ncurses menuconfig UI for hours, selecting options and trying not to forget anything, and issuing the same commands over and over again to recompile and reboot as time well spent. Sometimes, it’s nice to learn something new; but after that, it just becomes tedium. Gentoo forces you to stay with the tedium even after you’ve learned the concept. Again, while this is useful in some circumstances, like testing CVS kernel builds & modules and developing for the kernel, for desktop use it’s just a huge, incredible pain.

One part of Gentoo that you “learn about” is something no one really wants to learn about because no one cares, and that’s the intricacies of the package management system (Portage). Most of my time spent on meta-computing in Gentoo was editing the various mask definition files, which set rules for which versions of packages can be installed. I was trying to “coax” (more like fustigate) Portage into “letting” me install what I wanted and to not clobber my package the next time I try and upgrade. This can require unmasking a package and all of its dependencies, which, if you’ve ever tried to install an experimental version of GNOME, means tracking down tons of packages and their dependencies in turn and manually specifying the package names and versions to Portage. If you’re interested in using experimental software, this can become really time consuming, and often causes forced downgrading or broken software when doing a system upgrade. I know a lot about Portage now after fighting with it for a few years, but the knowledge is not something I intended to learn nor is it useful. It’s just extra cognitive noise that comes with using the system.

Bleeding Edge
See major deficiency number 1 in the next section.

Major Deficiencies in Gentoo
Aside from the spurious advantages of Gentoo, there are some major, fundamental deficiencies in the distribution that I’ve come to see in retrospect.

Constant breakage
Gentoo is bleeding edge, and the upgrades come fast and furious — so I upgraded often. Many times I was prevented from upgrading, usually with Portage telling me that something had changed in the package structure and some of my package upgrades were blocked. Great, thanks. How about you just take care of it? No, usually I have to manually uninstall the application that has been moved somewhere else, and then reinstall the moved version, so Portage’s worldview doesn’t fall apart.

While that is just poor usability (software should be taciturn with its problems), Gentoo breaks in more insidious ways. Sometimes it will start compiling software and the versions of the dependencies are such that you get compilation breakage. So instead of telling me the packages are incompatible, the compilation fails. Now I have to go searching through the compiler failures for some obscure reference in a source file that hopefully uses some key words that indicate which package is at fault. Lord help you if you’re outside of X and have to figure out a creative way to log the output of the build so you can review it with scrolling once the build fails.

As a software developer I understand the issues at hand here, but as a user of a distribution, they’re issues that I just don’t want to deal with — especially when I issue a system-wide upgrade command and go to bed, which is necessary due to the next problem:

Installing software takes time
A lot of time… if you use any applications with a large source base, beware! Upgrading something like Open Office can take hours. The general recommendation is “don’t watch it install, go do something else!” Before we got some nice scheduler upgrades in the kernel, “doing something else” with your computer was impossible. Now that Linux can handle compiling software and keeping a GUI responsive, that’s not so much of an issue. But, honestly, people don’t want to wait for software to install. Usually I won’t “foresee” my need for a program an hour before I need it, and responsibly tell Portage to get the software for me, so that in an hour it’s ready when I need it. If you abruptly install a new application, it’s because you need it now; Gentoo’s software model breaks another core tenet of usability: instant gratification. I don’t want to have to put distcc Knoppix in my car radio so that my car can help compile my software from the parking lot, but that’s often what is necessary to get anything installed in a reasonable amount of time.

Compilation is not free
Compilation is not free; treating it like it’s free is a fallacy. You tie up the power of your computer, usually for a couple of hours if you’re installing anything substantial (like the new Firefox), which means you can’t do anything else intensive for that time period. You’re forced to stop working in GIMP and go get a glass of chocolate milk while your system is busy getting the latest MonoDevelop, which you need installed to continue your work.

Secondly, there’s a ton of wear and tear on your system. Pushing the processor at 100% for hours at a time can cause some serious heat, and will usually crash a system that’s poorly cooled that would have otherwise run fine (or at least crashed at a later date). Many times I would wake up in the morning to find my machine completely hung because it overheated due to night compiles. Lots of heat wears down components and invariably generates more noise.

Possibly worse is the churn happening on your hard drive. Gigs of source code (source is always bigger than the binary) are downloaded to be compiled into the final binary version, after which the source is then thrown away. Factor in all of the temporary files created during compilation and you’ve got significant hard drive churn that comes from the way applications are installed. The package management in Gentoo probably resulted in 10x more writes to the hard drives than I caused by modifying my own data and using my applications. More writes means shorter life on the hard drive, and no one likes to have a failed disk.

Compiling everything you install is already bad enough for system resources, but when Gentoo pushes out package upgrades, often they’re only small patch changes or package revisions, like (1.10.2 -> 1.10.2a) which require huge recompilations. The upgrades that come from small package changes almost never should require a full recompilation; package maintainers, how about you flag whether a package upgrade (a->b) requires a full recompile, thus avoiding the whole upgrade ordeal if possible?

There are many more things to be said, but they will probably become irrelevant in the future because Gentoo is improving so quickly. The deficiencies I’ve outlined above are fundamental in that they’re intrinsically tied to the way Gentoo works, and so can’t really be “fixed” unless we get some significant technology changes. On the whole, it’s a well-engineered distribution with a thriving user community. It’s just not purposed as a desktop distribution. If that’s your usage model, Gentoo is not for you. The reason I ultimately moved away from Gentoo is that I realized it was not built to cater to my desktop usage model, and that there were other distributions out there that did the job better.

Monster hare will destroy rain forests

Saturday April 8, 2006

In a world where you have to fear giant centipedes eating your children, you now must worry about gigantism mutations spreading to mammals. Expect very soon to have monstrous angry hares ramming your car, pulverizing your cat and giving you mean, glowering glares while eating whole trees that stand around your house.

Monster bunny

Vicious kernel panic in Ubuntu Dapper related to Nautilus

Friday April 7, 2006

I finally found the source of the unbearable kernel panic I get every time I log in (via GDM) with Dapper pre-release. It’s been happening for about a month now. The cause:

A desktop wallpaper

That’s right. While we can build systems to recognize faces from low-dimensional subspaces based on the principal components of images; while we can construct scalable networks with no centralization that service billions of users; while we can cram a full PC into an LCD monitor, setting a desktop wallpaper still eludes us.

Sorry. Manually rebooting frequently after kernel panics can try one’s patience. The cause is either nautilus (which displays the desktop wallpaper) or the ATI drivers.

When using the “radeon” driver (OSS, non-accelerated), X is happy to simply crash out with no warning or log, but thankfully it leaves my kernel in a state of sanity.

When using the accelerated ATI drivers (“fglrx”), however, the crash occurs in the same place, but this time it takes the entire system with it. The kernel panic goes something like this:

kernel panic - not syncing: Fatal exception in interrupt

… and the rest of the error message? Well, the rest of the message is cut off at the bottom of my screen, because Ubuntu thinks my terminal is two lines bigger than my monitor, such that the top and bottom lines of terminal output are cut off and not displayed (working on the terminal framebuffer is a pleasure).

To work around this problem, I can ssh in from another computer, move the wallpaper file so that nautilus can’t find it when it starts up, and the next login goes smoothly (the wallpaper defaults to a solid brown color). It’s not the bitmap I’m using, either; it occurs with any bitmap.

Google has a modicum of information about this problem, so maybe this will help someone.

Calling managed code from native C/C++ using callbacks

Friday April 7, 2006

Callbacks are useful for notifying listeners when something happens, among other things. They’re implemented in .NET through delegates. It’s not immediately obvious how to make use of delegates when calling unmanaged code. The documentation is out there but it’s not easy to digest, so I’m going to synthesize it here. The example code is not complete enough to compile, but it should outline the main ideas.

Consider the somewhat contrived scenario where a C method performs a long running mathematical operation (generates a hash), and then calls back a registered listener when it has the result. If you’re writing C++ code, you can define the native method signature as such:

// This is what the signature of the method should look like:
typedef void (__stdcall *HashCallback)( int result );

// This listener wants to know when we're done computing the hash
HashCallback listener;

When the native code performs the hash, it can send the result to the listener:

void Hash(int a){
  int result = LongRunningHashRoutine(a);

  // Notify who's listening for the results
  (*listener)(result);
}

You can accept a handler like this:

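extern "C" {
  // Exported functions need a return type and external linkage (no "static")
  __declspec(dllexport) void SetHandler(PVOID functionPointer){
    // Store the pointer in the listener variable declared above
    listener = (HashCallback) functionPointer;
  }
}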

You’ll want to use extern "C" to prevent the name of your method from being mangled by the compiler, so you can import it into managed code.

You can then platform invoke the native library, passing in a delegate that matches the signature of “HashCallback”. The following is C#:

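using System.Runtime.InteropServices;

// Define a delegate that matches the method definition of the HashCallback
public delegate void MyCallback(int result);

// Import the function that was exported from the native DLL.
// SetHandler is exported without __stdcall, so (assuming default compiler
// settings) it uses the C calling convention.
[DllImport("MyNativeCode.dll", CallingConvention = CallingConvention.Cdecl)]
public static extern void SetHandler(MyCallback callback);

// Keep a reference to the delegate so the garbage collector doesn't
// reclaim it while the native library still holds the function pointer.
private MyCallback callback;

public void Initialize(){
  // Give the native library a handle to our callback, ReceiveHashResult
  callback = new MyCallback(ReceiveHashResult);
  SetHandler(callback);
}

public void ReceiveHashResult(int result){
  // This method will get called by the native library when it's done with its "Hash" method.
}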

Essentially, whenever the Hash method is called in the C++ DLL, it will “fire an event” and notify our managed code that it has just finished the hash computation.
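If you want to try the register-and-notify half of this without building a DLL, here is a minimal single-file sketch in portable C++. It makes several assumptions: it drops __stdcall, __declspec(dllexport) and extern "C" so it builds anywhere, the body of LongRunningHashRoutine is invented, and HashAndReport and lastResult are hypothetical helpers that stand in for the managed side.

```cpp
#include <cassert>

// Same shape as HashCallback in the post, minus the Windows-only __stdcall.
typedef void (*HashCallback)(int result);

// The registered listener; null until SetHandler is called.
static HashCallback listener = nullptr;

// Invented stand-in for the long-running hash routine.
static int LongRunningHashRoutine(int a) {
    return a * 31 + 7;
}

// Registers the function to call when a hash has been computed.
void SetHandler(HashCallback callback) {
    listener = callback;
}

// Computes the hash and notifies the listener, if one is registered.
void Hash(int a) {
    int result = LongRunningHashRoutine(a);
    if (listener != nullptr) {
        listener(result);
    }
}

// Plays the role of the managed ReceiveHashResult method: it records the
// value it was called back with.
static int lastResult = -1;
static void ReceiveHashResult(int result) {
    lastResult = result;
}

// Hypothetical helper: registers the callback, runs Hash, and returns
// whatever value the callback received.
int HashAndReport(int a) {
    SetHandler(ReceiveHashResult);
    Hash(a);
    return lastResult;
}
```

The null check in Hash is a small addition over the snippet above: it keeps Hash from calling through an unset pointer if nobody has registered a handler yet.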

References:
For more details and more code, see a very helpful post by Thomas Scheidegger

Daylight savings time should be destroyed

Sunday April 2, 2006

The man who invented daylight savings time is a thief and a liar. A thief because each Spring he steals one hour of my life, and a liar because, contrary to popular belief, time cannot “fall backward” or even “spring forward” outside of its regular, constant pace. The very notion is absurd.

There are some states in this union that have their heads on straight. The rest of us would do well to follow Hawaii, Arizona, and Indiana’s lead in abstaining from this abominable practice.

“next thing we know is they tell us to drive backwards” - Neha

Install latest Synergy on Ubuntu

Saturday April 1, 2006

Took me about 15 google searches to get the latest Synergy running on Ubuntu (dapper, should apply to and work with other Ubuntus) and I’ve found what I needed. Hopefully this will save someone some time.

The latest build (at the time of writing) in Ubuntu Dapper is 1.2.2, which is a full year old. With Synergy being so buggy (don’t get me started: it took until 1.3.0 to get this “bug” fixed: “Clients now detect loss of connection to server and reconnect”), it’s useful to have the latest build that hopefully fixes some of those bugs. The packages always seem to lag way behind the latest Synergy version, so here are some tips to get the thing installed from source. The comments and instructions here are for Dapper pre-release and Synergy 1.3.0.

Update: If you have the latest Synergy in your apt repository (run “sudo apt-cache show synergy” to see), then you can just install that (“sudo apt-get install synergy”) and not worry about compiling it from source. But if the packages are outdated, you can compile it from source as described here to get the latest version.

You can try and convert their RPM release to .deb using

alien synergy*.rpm

(requires package alien), and install it with

sudo dpkg -i synergy*.deb

but the synergy executable produced always complains about a missing version of libstdc++.

Alternatively, you can grab it from source. The configure script will complain like so:

configure: error: You must have the XTest library to build synergy

Well, just install it then, right? Unfortunately, searching with synaptic for “xtest” in both description and name doesn’t yield any relevant results. Of course, that’s because you need to search for “xtst” (duh!) and there you will find libxtst-dev. How intuitive.

With that, configure should proceed, but “make” may bomb out in the middle with an error like this:

In file included from CXWindowsClipboard.cpp:15:
CXWindowsClipboard.h:24:3: error: #error X11 is required to build synergy
In file included from CXWindowsClipboard.cpp:21:
CXWindowsUtil.h:23:3: error: #error X11 is required to build synergy
CXWindowsClipboard.h:38: error: expected `)' before ‘*’ token

Thanks synergy for telling me I don’t have X11! And thanks for telling me this during the BUILD stage rather than the configure stage! Of course on Ubuntu you DO have X11. Synergy just needs to explicitly be told where the includes are, because for some reason, it can’t find them itself. You can do this via the arguments -x-includes and -x-libraries to ./configure, as someone helpfully posted here.

So, in summary:

sudo apt-get install libxtst-dev
./configure -x-includes /usr/include -x-libraries /usr/lib --prefix=/usr
make
sudo make install

Update: this might also be of use:
Start Synergy with GDM on Ubuntu