Do Not Mess With This Guy

Faster Linux World Domination

This is a version of the mail posted to the lkml but with a couple of grammar things fixed.

“The future is open source everything.”
—Linus Torvalds

Dear LKML;

I have written a book that makes the case for Linux world domination. I find it interesting that the idea of Linux on the desktop is responded to by either yawns or derision. I think it depends on whether you see Linux as a powerful operating system built by a million-man army, or one filled with bugs and missing the cool stuff like speech recognition.

The points I make should be obvious to you all, but there are some pages on how to have Linux succeed faster that I thought I would summarize here. Given that this is such a high-volume list, I figured it cannot decrease the signal-to-noise ratio very much! ;-) I didn’t see that such emails are disallowed by the LKML FAQ.

I’ve been using Linux since mid-2005, and considering how much better things everywhere are now compared to then, it surely is an interesting time to be involved with free software. From no longer having to compile my Intel wireless driver or hack the xorg.conf, to the 3-D desktop, to better Flash and WMV support, to the countless kernel enhancements like OSS -> ALSA and better suspend/resume, things are moving along nicely. But this is a constant battle, as there must be 10,000 devices, with new ones arriving constantly, that all need to just work. Being better overall is not sufficient; every barrier needs to be worked on (http://www.joelonsoftware.com/articles/fog0000000052.html).

The Linux kernel:

The lack of iPod & iTunes support on Linux is not a bug solved by the kernel alone, but Step 1 of Linux World Domination is World Installation. Software incompatibilities will be better solved as soon as the hardware incompatibilities become better solved. The only problem you can’t work around is a hardware problem.

If you hit a kernel bug, it is quite possible the rest of the free software stack cannot be used. That is generally not the case for other software. Fixing kernel bugs faster will increase the pace of Linux desktop adoption, as each bug is a potential barrier. If you assume 50M users running Linux and each bug typically affects 0.1% of those users, that is tens of thousands of people per bug. Currently, the Linux kernel has 1,700 active bugs (http://tinyurl.com/LinuxBugs). Ubuntu has 76,371 bugs (https://launchpad.net/ubuntu/+bugs). I think bug count goals of some kind would be good. Perhaps Ubuntu could focus on decreasing the number of bugs that aren’t also filed in the proper upstream bug database.
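The estimate above is easy to check. Here is a minimal sketch in Python that uses only the assumed figures from the paragraph (50M users, 0.1% of them affected by a typical bug). The function name is made up for illustration.

```python
def users_affected(total_users: int, fraction_affected: float) -> int:
    """Estimate how many users a single bug reaches."""
    return round(total_users * fraction_affected)


# Assumed figures from the text: 50 million users, 0.1% hit by a typical bug.
per_bug = users_affected(50_000_000, 0.001)
print(per_bug)  # 50000
```

Even a bug that looks rare in percentage terms reaches about 50,000 people under these assumptions.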

In general, Linux hardware support for the desktop is good, but it could get better faster. From Intel, to Dell, to IBM and Lenovo, to all of their suppliers, the ways in which they are all over-investing in the past at the expense of the future should be clear; the Linux newswires document them in detail on a daily basis. I was told by an Intel kernel engineer that his company invests 1% of the resources into Linux as it does to Windows. It is only because writing Linux drivers is so much easier that Intel is seen as a quite credible supporter of it. The few laptops by Dell that even ship with Linux still contain proprietary drivers, drivers that aren’t in the kernel, and so forth.

Peter Drucker wrote: “Management is doing things right, leadership is doing the right things.” Free software is better for hardware companies because it allows more money to go into their pockets. Are they waiting for it to hit 10% market share first? I recommend senior IBM employees be forced to watch their own 2003 Linux “Prodigy” video (http://www.youtube.com/watch?v=EwL0G9wK8j4) over and over, like in A Clockwork Orange, until they promise free, feature-complete drivers for every piece of hardware in the kernel tree before the device ships. How hard can it be to get companies to commit to that minuscule technical goal? In fact, it is hard to imagine you can be happy with a device without having a production Linux driver to test it with.

It is amazing that it all works as well as it does right now given this, and it is a testament to the generally high standard of many parts of the free software stack, but every hardware company could double its Linux kernel investment without breaking a sweat. The interesting thing is that PC vendors that don’t even offer Linux on their computers have no idea how many of their customers are actually running it. It might already be at the point that it would make sense for them to invest more, or simply push their suppliers to invest more.

There are more steps beyond Step 1, but we can work on all of them in parallel.

And to the outside community:
* Garbage collection is necessary but insufficient for reliable code. We should move away from C/C++ for user-mode code. For new efforts, I recommend Mono or Python. Moving to modern languages and fewer runtimes will increase the amount of code sharing and the pace of progress. There is a large bias against Python in the free software community because of performance, but it is overblown because it has multiple workarounds. There is a large bias against Mono that is also overblown.
* The research community has not adopted free software and shared codebases sufficiently. I believe there are enough PhDs today working on computer vision, but there are 200+ different codebases (http://www.cs.cmu.edu/~cil/v-source.html) plus countless proprietary ones. I think scientists should use SciPy.
* I don’t think IBM would have contributed back all of its enhancements to the kernel if it weren’t also a legal requirement. This is a good argument for GPL over BSD.
* Free software is better for the free market than proprietary software.
* The idea of Google dominating strong AI is scarier than Microsoft’s dominance with Windows and Office. It might be true that Microsoft doesn’t get free software, but neither do Google, Apple, and many others. Hadoop is good evidence of this.
* The split between Ubuntu and Debian is inefficient as you have separate teams maintaining the same packages, and no unified effort on the bug list.
* The Linux desktop can revive the idea of rich applications. HTML and Ajax improve, but the web defines the limits of what you can do, and I don’t think we want to live in a world of HTML and Javascript.
* OpenOffice is underfunded. You wonder whether Sun ever thought they could beat Microsoft if they only put 20 developers on it. Web + OpenOffice + a desktop is the minimum, but the long tail of applications that demonstrate the power of free software all need a coat of polish. Modern tools, more attention to detail, and another doubling of users will help. But for the big apps like OpenOffice, it will take paid programmers to work on those important beasts.

There are other topics, but these are the biggest ones. I give away the PDF http://www.lulu.com/product/download/after-the-software-wars/6276446. I’ve talked to a number of kernel and other hackers while researching this and it was enjoyable and interesting. I cite Linus a fair amount because he is quotable and has the most credibility with the outside world ;-) Although, Bill Gates has said some nice things about Linux as well.

If you want to respond off-list, you can comment here http://keithcu.com/wordpress/?p=272.

Thank you for your time.

Keep at it! Very warm regards,

-Keith

The Continuing Adventures of Dubya

This video is funny.


Computer vision as codec

I’ve tried for a while to figure out why computer vision is mostly still in research labs in spite of the fact that there are many thousands of people and different algorithms and codebases for doing computer vision. One analogy that occurs to me is image compression.

There are an infinite number of ways of compressing an image, and each one gives a different result. In principle, we could have 1,000s of people around the world working by themselves on this very hard problem. But, it would be better to take a combination of the best ideas, and have everyone use that.

While codecs and computer vision seem quite different, they share an important similarity: in the pipeline of computer vision, from pre-processing to feature extraction, each step produces a smaller amount of data. At the end of the analysis you might be left with the data that this is an image of your house, which is just a few bytes. This compression is also precisely what a codec does.
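To make the shrinking-data point concrete, here is a toy sketch in Python. It is not a real computer vision algorithm: the 4x4 "image", the threshold of 50, the function names, and the labels are all invented for illustration. It only shows that each stage hands less data to the next one.

```python
# Toy vision pipeline: every stage emits less data than it receives.
# The image, the threshold, and the labels are invented for illustration.

def to_edges(image):
    """Mark pixels whose right-hand neighbour differs sharply (a crude edge detector)."""
    return [
        [1 if abs(row[x + 1] - row[x]) > 50 else 0 for x in range(len(row) - 1)]
        for row in image
    ]


def to_features(edges):
    """Collapse the edge map to one number per row: the count of edge pixels."""
    return [sum(row) for row in edges]


def to_label(features):
    """Collapse the feature vector to a single short answer."""
    return "edge" if sum(features) > 0 else "flat"


image = [
    [10, 10, 200, 200],
    [10, 10, 200, 200],
    [10, 10, 200, 200],
    [10, 10, 200, 200],
]

edges = to_edges(image)        # 4 rows of 3 values
features = to_features(edges)  # 4 values
label = to_label(features)     # 1 value

# Number of values held at each stage of the pipeline.
sizes = [
    sum(len(row) for row in image),
    sum(len(row) for row in edges),
    len(features),
    1,
]
print(sizes, label)  # [16, 12, 4, 1] edge
```

A real pipeline would start from millions of pixels rather than 16 values, but the shape is the same: a lot of data goes in and a short answer comes out.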

Another similarity is that decoding is much simpler than encoding. Decompressing an image is faster than compressing it, and the encoders can typically get smarter while the decoder doesn’t even realize it. Likewise, we have plenty of software today that can generate a photo-realistic image of a house. The computer is doing the reverse process of what happens in our eyes.

So perhaps we have 1000s of computer vision people around the world taking an image and extracting the data, when it is some combination of their ideas that is the best. To be fair, this doesn’t tell us how hard the problem is. Will it take the best ideas of 3 or 50 people?

Answering that involves looking at each piece. Note that there is plenty of good free code for image processing, which is an important piece of computer vision. When it gets to lines and edges, things seem less settled. I suspect that there are many ways to do this, and we should just pick some robust way and move on. [More here]

I’ve discovered the best codebase for people who want to work on computer vision is http://stefanv.github.com/scikits.image/index.html. It’s got Python-powered SciPy and DVCS.

So let’s get going.

Virtual Darpa Grand Challenge slide deck

I have had this idea for a Virtual Darpa Grand Challenge for a couple of years now and I’m shopping around this slide deck to angel / VC people. I don’t know many, but I am looking and learning.

But I thought I’d also put this out there to the Linux community and see what they think of it. I’ve never done anything like this before so I’m not even sure if I should take on this idea, but I’d be interested in hearing what people think of it and any advice on how to make it happen.

(The first few slides are background because I can’t assume someone knows about the benefits of free software.)

Thanks!

-Keith

Here is the latest version of the PDF you can also download.


Open Racing

Comment to Mark Shuttleworth

Mark announced he is stepping down as head of Canonical to work on design and quality, and this is what I wrote in his blog comments section:

You should focus on the buglist as one of your most important metrics, surely more important than boot time! You should use your Bully Pulpit and rally each team in the Ubuntu community, and those who work with Ubuntu, to get the buglist under control. Software that has bugs is like a house with 99% of a roof. It is impossible to have a quality product with an out-of-control buglist. I get concerned that every new release of Ubuntu has some ADD-like focus on a shiny new feature, and the fundamentals are being ignored.

The answer is not to be more or less “conservative” about software versions. I’m not going to argue that new Ubuntus are less reliable than previous releases as all releases have had bugs. (Breezy would sometimes hang on boot on my dual proc machine with some sort of race condition.) But it is not getting better. I live in Seattle, and if a Boeing crashes, it is very bad news. Pretend the same for Ubuntu.

Set goals and measure progress against them. Perhaps the most important is hardware because you can’t use any of the software until your hardware works. Step 1 of Linux World Domination is World Installation.

My book has more than one chapter full of advice for the Linux community.

Linux’s Growing Pains

The last two releases of Ubuntu (9.04 and 9.10) seem to generate a lot of complaints of bugs. The biggest problem is hardware because if your hardware doesn’t work, you can’t test out the software, and the hardware bugs are the hardest to find because nearly every computer on the planet is different!

I agree and I have found bad bugs in both releases. In 9.04, my Intel video driver was slow and leaked memory like a sieve. In 9.10, that problem has been fixed, but now sound volume maxes out at 26%, my mouse doesn’t work when coming out of sleep, and when I open the laptop lid, the backlight doesn’t always come on. (The last two bugs appear to have been fixed post-ship.) Of course, I’ve found bugs in every release.

The good news is that this is all very natural, and even to be expected given the deep changes that are being made to the stack. And for every user that has problems, there are users who have a better situation with the latest release. Bugs cannot be fixed until they are found, and they cannot be found until users are running the code. Ubuntu’s large user base means that it will find bugs not found by the upstream developers, which are mostly teams of just handfuls of people. And the bugs that Ubuntu users run into are nearly always bugs in the upstream code, so it isn’t entirely Ubuntu’s fault.

Solutions

Ubuntu has several choices. It can be conservative about what versions of software it runs, especially hardware-related components, so as to let the smaller distros find and fix the bugs. The downside of this is that Ubuntu is not carrying its own weight, and in general new software has new features that users want. What is the point of making a release every 6 months if it contains 6-month-old software? A corollary of this is Mark Shuttleworth’s “cadence” suggestion: by having multiple distributions ship on the same day, they would presumably choose to use the same versions of software and share the load in finding bugs.

Some say that Ubuntu should ship less frequently, but a better solution is for Ubuntu to put more resources on the fundamentals. With every release, Ubuntu seems to have some shiny new trend it is talking about: cloud computing, new notifications, etc. and I worry they seem to become easily distracted and forget to keep fortifying their investments in the basics: Step 1 of Linux World Domination is World Installation.

Another reason that this problem is happening is that Ubuntu has chosen to be a separate team from Debian. Many of the bugs were found before the release of Ubuntu, but there just weren’t enough people or time to track them down and work with the upstreams to find a fix. And because Debian is a separate team, they are not engaged in this battle. (I have been complaining about the mistake of Ubuntu being a separate organization from Debian for 3.5 years so I won’t go into this any more.)

Linux is making good progress but has a ways to go still. Ubuntu currently has 74,823 bugs. Focus on the bugs! You might not believe it, but at Microsoft we had it beaten into our heads to fix bugs: a bug meant an unhappy customer, and a bug that affected just 1% of users meant that there were millions of unhappy customers. Software that doesn’t work is not worth anything, and the bug list is the most important metric an organization could possibly be focused on. It is a problem that people in the Linux community talk much more about boot time than bug count.

Follow me @keithccurtis on Twitter.

Should Ubuntu have been created?

Software Wars

Amazing documentary called Dare to Dream

I just watched an incredibly inspiring, funny and interesting documentary about the history of women’s soccer. It is so good it should be shown in movie theaters!

If you are a woman, or like women, or have a daughter, you must check it out!

You can watch it on HBO:
http://www.hbo.com/apps/schedule/ScheduleServlet?ACTION_DETAIL=DETAIL&FOCUS_ID=627001

Or you can purchase it from Amazon:

My Mono e-mail response to RMS

I take a risk by crossing the street.

In order to know the risks attached to Mono, you’d have to know what people inside MS think. Of course, since you’ve not chatted with any MS employees, you have no way of knowing the actual risk. Mono is not the same thing as TomTom. It would be nice if you acknowledged in the future that you don’t know whether MS views Mono as a friend or an enemy.

Your fear-mongering reminds me of the Bush administration ;-)

You also don’t consider whether the patent issue can be easily worked around without throwing away all of Mono. Not all patent risks are the same. And do you actually know of any specific problems? Microsoft claims that there are hundreds of patent violations in Linux and so the safest thing is to run Windows. Is that what you will recommend next?

Apps like Gnote (which is a line for line port of TomBoy, missing many features of course, and it is easier to port than create anew) are created partially because of you. And Gnote is a total waste of time. You are also pushing people to stick with C and C++, and this is much worse for the free software community. If you had read my book, you would understand why we need to retire C and C++ as soon as possible.

And why not stand up against software patents? Let’s fight!

I sometimes feel that you should keep your opinions about free software focused only on the GPL and software license issues. Wading into programming languages and patent risks is much more complicated.

I’m on Twitter

Look for me @keithccurtis

I promise it won’t contain junk. In fact, I purge the twits from my twitter account.

Driverless cars

I believe we could have had robot-driven cars years ago. We need:

  • A video camera
  • A computer
  • Software

It should be obvious that we are missing only the software. What we need is 100s of people working together. There actually are enough computer vision people out there today, but they are not working together! They need codebases they can collaborate in, and big, worthy tasks.

My exciting insight is that instead of immediately hooking up computers to cars and then spending months writing device drivers, we should start by hooking a vision engine up to a driving video game. We can cheaply and safely simulate all sorts of sensor inputs, fog, urban scenarios, etc. And anyone around the world can jump in and help out.
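The loop this implies can be sketched in a few lines of Python. This is only an illustration under heavy assumptions: the "game" is a one-line road with a single bright pixel marking the lane centre, and `ToySimulator`, `estimate_offset`, and `steer` are invented names that do not correspond to any real game's API.

```python
# Hypothetical sketch: a simulator feeds frames to a vision step, which feeds a controller.
# ToySimulator, estimate_offset, and steer are invented names, not part of any real game API.

class ToySimulator:
    """A one-dimensional road; the camera sits at the middle pixel of each frame."""

    def __init__(self, offset):
        self.offset = offset  # positive means the car sits right of the lane centre

    def render_frame(self, width=21):
        """Return one row of pixels with a single bright pixel at the lane centre."""
        lane_pixel = width // 2 - round(self.offset)
        lane_pixel = max(0, min(width - 1, lane_pixel))
        return [255 if x == lane_pixel else 0 for x in range(width)]

    def step(self, steering):
        """Apply a steering command; positive steering moves the car to the right."""
        self.offset += steering


def estimate_offset(frame):
    """'Vision engine': find the brightest pixel and infer the car's lateral offset."""
    lane_pixel = frame.index(max(frame))
    return len(frame) // 2 - lane_pixel


def steer(estimated_offset, gain=0.5):
    """'Driving AI': steer back toward the lane centre, proportionally to the error."""
    return -gain * estimated_offset


sim = ToySimulator(offset=4.0)
for _ in range(10):
    frame = sim.render_frame()               # simulated sensor input
    sim.step(steer(estimate_offset(frame)))  # perception, then control

print(sim.offset)
```

The controller never reads `sim.offset` directly. It only sees rendered frames, which is the property that would let the same vision code later run against a real camera.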

Torcs re-engineering w/ Mono

Right now the focus of JC Hoelt and me is taking the best free driving game, Torcs / Torcs-NG[1], improving the codebase to simplify and modernize it, and porting it to Mono. Then it will be suitable for new people to improve the simulator and do vision and driving AI experiments. Unbeknownst to even its maintainers, Torcs is hampered by its age and its C/C++ code. It uses the ancient and primitive PLIB, so we are making it use .Net wrappers around the modern graphics library Ogre. Its physics sometimes behaves oddly, so we are making it use ODE & ODE.Net. In many ways big and small, Torcs is doing things the hard way: it has even written its own XML parser, which we can just throw away. It is no coincidence that the track format for Torcs has not changed in 10 years when the current codebase is such a complicated mess! For the current team, even a little task like porting to the Mac is a major piece of work.

When we are done, we will have a 10x smaller and cleaner codebase that will serve as a great baseline to add features a lot faster.

Here is a link to our wiki.

We’re looking for people who want to help us out on any coding aspect. There are all kinds of easy and hard programming problems. If you’ve got a few hours a week of time, contact me! It is cool and interesting.

[1] Torcs-NG is a fork of Torcs that is making small changes (automake -> CMake, new artwork, Mac port, etc.). We can keep up with the changes made by both teams.

I’m gonna be rich!!

Man who will make me rich!!
This man needs my help to transfer 30 million dollars to a US bank account and I get 15%! I had no idea South Africa was such an unstable political environment that the banking system is not reliable, but their loss is my gain.

Note the passport looks a little suspicious to me, as his personal details are done in a different font from the country name, and his personal picture looks like it was added on the computer later. I’m trying to get a bigger copy of the passport and a copy of his CV, to get him to send me any sort of innocuous mail from his official business account, and to understand why South Africa is unstable like Zimbabwe and Somalia. But I’m not worried: anyone with glasses like that has to be an honest man. He’s a respected civil servant in the Ministry of Finance of South Africa and that’s all I need to know. I’m going to be able to live like a rock star! I’ll keep this post up to date with the latest news.

Update: I think Khumalo Donald figured me out. Oh well, the dream was fun while it lasted.

Going to Lang.Net

This is the talk I gave:

E-mail I sent to Mark Shuttleworth

Hi;

I like Ubuntu very much, but I find it annoying how behind the curve you guys are with your releases. Jaunty is the first release of Ubuntu that has Mono debugging support out of the box. Jaunty will ship with Mono 2.0, which was released October 6, 2008, yet Mono 2.4 has just been released! I’m going to have to wait till October to get the Mono bits that were released in March.

Here is a list showing more examples of how Fedora is more up to date:
http://subbisays.blogspot.com/2009/03/ubuntu-904-vs-fedora-11-lot-can-change.html

In general, new software is better and more reliable than old software. You guys spend lots of time backporting fixes from newer builds, when those problems would be solved more efficiently by just taking the newer builds! You guys also aren’t helping advance the state of the art by working on old software. Novell couldn’t care less about bugs in Mono 2.0.

If you guys ship every 6 months, using software that is 6 months old, did you really ship on day X, or 6 months ago and just sit on it?

I realize you guys have a tradeoff between stability and freshness, but I think your team is not making the right tradeoff, and I see this as a problem that crosses many teams. If there are any problems (that likely affect just a few customers), you can fix them right after release. What is the whole point of having this infrastructure of repos and backports?

-Keith