Duplication of effort is viewed as a sin in the Free and Open Source software (FOSS) development community. The whole ethos of FOSS dictates that developers should work together, sharing the improvements they make to software between them. For this reason forking, or starting a redundant project, is often viewed as an attack against the community.
The taboo against forking is well-documented. Eric Raymond (or esr) wrote about this in Homesteading the Noosphere, specifically in the section Promiscuous Theory, Puritan Practice. There he discusses three taboos of FOSS culture; note the first one:
The taboos of a culture throw its norms into sharp relief. Therefore, it will be useful later on if we summarize some important ones here:
- There is strong social pressure against forking projects. It does not happen except under plea of dire necessity, with much public self-justification, and requires a renaming.
I would like to make a specific case that starting a new project, depending on how it is done, can be as violent to the community as an unwelcome fork. I believe that new projects, without new advantages, are just as damaging to the community as forks are. If the community agrees with me on this, I will ask Eric to add “starting redundant projects” to his list of taboos. [section added 4-10-08]
Two separate projects, both doing essentially the same thing in the same way, fracture the community into two factions of programmers who can no longer work together. To fully understand this problem, one has to understand a little about how the Free and Open Source development community works. The first thing to understand is the licensing.
FOSS licensing allows anyone to download software source code and modify it to their heart’s content. While anyone can make modifications to FOSS licensed code, there is a community standard that there is one “official” person or group who controls the official version. For Linux this person is Linus Torvalds. There are many “versions” of Linux where code is developed and improved, but the community agrees that the version that everyone trusts is the version that Torvalds releases. Legitimate Linux development always aims to add features to Linux that will ultimately meet with Linus’ approval. But this is only because of community custom; the license does not require this at all. Eric entitled the chapter above Promiscuous Theory, Puritan Practice precisely because FOSS licenses allow forking while the FOSS community frowns on forking. There is nothing, other than cultural rules, stopping me from creating an operating system kernel based 100% on the code currently residing in the repository maintained by Torvalds, and calling the project “Fredux”.
If I started “Fredux”, either by forking the Linux codebase or by starting a new kernel project from scratch, and posted it to SourceForge, the Linux community would either mock me or ignore me. The reason is simple: I am not important in the operating system communities and Linux is very well established. However, it would not be difficult to really piss them off. My hypothetical “Fredux” project could easily anger the Linux community by doing any of the following.
- Post something to public forums, especially the Linux Kernel mailing list, in an attempt to attract developers from the Linux community over to Fredux.
- Pretend in public that my project is unique or innovative in a way that distracts from Linux and its legitimate competitors.
- Get investment capital, or grant funding, at the expense of competing Linux-based projects to develop my Fredux operating system.
What do these actions have in common? They all poach resources that could have legitimately gone to the Linux community. If I started an operating system from scratch the community would react… poorly. The core issue would be that I would be acquiring resources that should have gone to Linux or another legitimate operating system project.
Why? What is wrong with starting a new operating system project? There is no problem with starting a new legitimate operating system, and then competing for resources for it. The problem comes when you start an illegitimate new operating system and try to pass it off as a legitimate one. Of course this raises the question: What makes a new project or a fork legitimate?
Here are some guidelines for when a new project might be in order. (contact me if you feel I have missed something)
- The project uses a different programming language, which has some advantage in the field of inquiry.
- The project addresses a serious feature gap in current projects.
- The project addresses a serious design limitation in current projects.
- The project uses a new programming paradigm that has advantages over those currently in use.
- The project uses a different development process that might have some advantages.
- The project uses a more common and accepted FOSS license than alternative projects.
So let’s see how this applies to Fredux! Here is a hypothetical release announcement for Fredux version 0.1:
Fredux is a new operating system kernel available under Freds Very Open Sourcey License. Fredux is coded entirely in C which means that it compiles to blazing fast machine code. Fredux is designed to provide a kernel for a POSIX compliant operating system. Fredux consistently uses a modular approach to software design, allowing it to be extended easily. Fredux will use a distributed development model, ensuring that everyone can see the code and allowing the “many eyeballs” effect to take place. Most importantly, Fredux is completely Open Source and Free Software!
OK, what are the problems with this announcement? First of all, the “usual suspects” of FOSS operating systems (GNU/Linux, FreeBSD, OpenBSD, Hurd, etc.) are all written in C, so there is no language benefit to Fredux. Linux and several other projects strive for POSIX compliance, so Fredux has not addressed a feature gap. Because Linux is modular, there is no design advantage. Fredux is developed in the same way Linux is, ergo there is no development style advantage. Lastly, Fredux is NOT open source! You will find that Fred’s Very Open Sourcey License does not appear as an OSI approved license. You will also not find the license on the list of FSF approved licenses. So the license is neither “Free” nor “Open Source”. It is a shame how many companies continue to market themselves as Open Source without actually stepping up and meeting the licensing requirements.
On the other hand, let’s look at the description of a legitimate operating system project.
Hurd is a collection of services that run on the Mach micro-kernel. The Hurd implementation is aggressively multi-threaded so that it runs efficiently on both single processors and symmetric multiprocessors. It is possible to develop and test new Hurd kernel components without rebooting the machine (not even accidentally). Running your own kernel components doesn’t interfere with other users, and so no special system privileges are required. Unlike other popular kernel software, the Hurd has an object-oriented structure that allows it to evolve without compromising its design.
These sentences are taken verbatim from the Hurd main page, but I have re-arranged them to make a point. The paragraph above details supposed advantages of using a micro-kernel design; Linux and other free operating systems usually use a monolithic (“macro-kernel”) design. The Hurd people think this is a design improvement on Linux. This description lists features (modify the kernel without rebooting) and architecture advantages. One can debate issues like macro-kernel vs. micro-kernel for days on end (and people do). What I am trying to clarify here is that, because Linux is so well-established and capable, the Hurd project differentiates itself and explains why it is also a legitimate project right on the project’s home page.
On the home page of the OpenBSD project we find:
The OpenBSD project produces a FREE, multi-platform 4.4BSD-based UNIX-like operating system. Our efforts emphasize portability, standardization, correctness, proactive security and integrated cryptography. OpenBSD supports binary emulation of most programs from SVR4 (Solaris), FreeBSD, Linux, BSD/OS, SunOS and HP-UX.
Again, this is taken verbatim from the OpenBSD home page. Again, note the emphasis on things that differentiate the project from the dominant Linux project. OpenBSD is famous for its focus on security, and the main page covers this advantage. It is also famous for supporting lots of architectures, and that too is right on the main page. I hope that I am demonstrating, very explicitly, what it means to establish legitimacy in the context of a currently dominant project or projects. OpenBSD even goes so far as to list the projects and products that it competes with.
When multiple projects are legitimate it is reasonable for each project to compete for developers. Want to work with the most popular OS project? Choose Linux. Want to work with a bunch of people obsessed with security? That would be OpenBSD. Want to work on a micro-kernel? Work with Hurd.
But you should not work on Fredux. It has no advantages over any of these projects. The competition between legitimate projects is… well… legitimate. But the FOSS community will instantly attack anyone who poaches resources for a project that has no discernible advantage over an existing project.
In the medical FOSS community, we are particularly sensitive about it. We have a small community, and dividing it is a real problem. It happens too much already, for “legitimate” reasons. The holy grail in FOSS Health IT is the creation of an Electronic Health Record (EHR). We have several legitimate projects in this space. All of the legitimate projects are starving for resources like developers, documentation writers, clinical experts, funding sources and users.
What are the “legitimate” projects competing to build the best FOSS EHR? Sorry, but this post is already too long. That question merits a whole article on its own.
For now I want to talk about three projects in particular that together illustrate the problem of duplication of effort: Tolven, OpenMRS and ExampleProject [update 4-10-08 to protect the guilty, the name of the offending project has been removed]. Tolven and OpenMRS both have reputations for being mature, Java-based projects. Here is the description of Tolven.
Tolven EHR and PHR products are a suite of Java-based technologies that use the latest in Open Source Java application toolkits to create a heavily Object Oriented, MVC EHR application. In short, the Tolven strategy is to build a very clean Java-based EHR using solid OOP design with the latest tools.
Here is a description of OpenMRS.
OpenMRS is a community-developed, open-source, enterprise electronic medical record system framework intended to aid resource-constrained healthcare environments.
OpenMRS is currently implemented in Kenya, Rwanda, South Africa, Uganda, Tanzania, Zimbabwe, and Peru. Many of the developers are MIT trained and the design of OpenMRS is based on work done at the Regenstrief Institute. 2008 will mark its second year participating in the Google Summer of Code. The design of OpenMRS is carefully thought out to support a rural deployment model. OpenMRS is a very mature project. [added OpenMRS 4-10-08 to acknowledge the tremendous things the project has done]
A bit of inside information about Tolven: At the time of this writing Tolven is going live in several installations. Tolven has a tremendously talented development team. Tolven has received substantial funding. They have an EHR and PHR already working.
On my short list of legitimate FOSS EHR systems (and believe me it is short), Tolven and OpenMRS are currently battling for the title of top Java-based project. To review: Tolven is funded, going live now, 12 full-time developers, working software = legitimate. OpenMRS is funded, deployed all over the place, and backed by some of the top minds in Medical Informatics = legitimate.
ExampleProject is a Java-based EHR system using strict OOP principles and the latest Java software stack.
ExampleProject is in alpha. ExampleProject is primarily developed by one guy, who is working part-time. ExampleProject plans to have a release that can be used in a live environment by October of 2008 (a little less than a year away from this writing). ExampleProject is not legitimate.
In short, ExampleProject appears to be a Fredux. What is worse, ExampleProject is constantly put forward as a legitimate project on LinuxMedNews, EMRupdate and the openhealth mailing list. Those forums are intended to be open and so censoring ExampleProject news would be unethical. The problem is that it appears to the uninitiated as though ExampleProject is important, when in fact it is exactly the opposite, an illegitimate project. ExampleProject has attracted at least two other part-time developers, and seems to be succeeding in acquiring attention and resources that properly should go to Tolven, OpenMRS or some other project. ExampleProject is guilty of the sin of duplication of effort. Everything they are doing has already been done, and well, by another project in the same programming language. Every developer who works with ExampleProject instead of working with Tolven or OpenMRS is accidentally wasting their time. The same is true of beta testers, investors, and others who might be interested in working with a project. Because Tolven and OpenMRS continue to improve, and at a greater rate than ExampleProject, it is very unlikely that ExampleProject will ever catch up. While it is fine for the project manager of ExampleProject to waste his time, it is not OK for him to trick others in the community into wasting time with him.
I have discussed this issue with the ExampleProject project manager. Before I mentioned it to him, he had never heard of Tolven. After spending about ten minutes on the Tolven site he said that the big difference between Tolven’s efforts and his own was that ExampleProject was using Swing, where Tolven was using AJAX. This seems like it might be a legitimate technical difference, except that Tolven would likely be very happy for someone from the community to step up and create a thick client using Swing. They would probably say something like “We do not have time to work on both AJAX and Swing interfaces. But our software design should allow any ‘viewer’ type. If you feel Swing interfaces would be better, and are willing to write it yourself, we would be happy to modify our interfaces enough to make that possible.” Even if Tolven were unwilling, the great thing about open source is that you can prove your technical points without the permission of the project manager. If Tolven or OpenMRS were shown that a Swing interface was better than an AJAX interface, they would probably adopt Swing.
A great example of a project that has spawned a sub-project specifically to address differences of opinion regarding user interfaces is the Kubuntu project, which is Ubuntu, only with KDE instead of Gnome.
When a separate project is justified is a complex issue. In fact, I gave Neil Cowles of the Tolven project a stern lecture about why he should be working through the OSCAR project, which was the dominant Java-based EHR project before Tolven came on the scene. Since that lecture Tolven has proven to me and the community generally that they are legitimate. It is possible that ExampleProject, despite its small beginning, will grow into a mature open source EHR. They could prove me wrong.
Right now I would guess that ExampleProject is about one year and a million dollars of development behind Tolven or OpenMRS. If you are a developer, clinician, investor, entrepreneur, or administrator you should probably work with Tolven or OpenMRS if you want a Java-based FOSS EHR. (I will update this article if this ever changes).
I hope I have accomplished two things at this point. I hope I have shown, through a hypothetical example and a real one, when duplication of effort is a problem. I also hope that I have said enough about ExampleProject to steer those who might want to work with a Java-based EHR to Tolven or OpenMRS.
-FT
(updated to clarify content and remove offending project name on April 10, 2008; if you care, just go read an archived version to find out who ExampleProject was)
(updated for readability Feb 27, 2008)
I was a little surprised when you started talking about Tolven and ******* (removed 4-17), because the issue of forking codebases is something I’ve been looking at in the past week or so and I thought you might be talking about the ClearHealth codebase.
I’m part of a small IT shop serving medical practices, and it strikes me that the same criticism that you apply to ****** (removed 4-17) could easily be applied to ClearHealth/MirrorMed. The availability of code is nice, but the message that I’ve gotten from reading what’s available is “if you’re a practice, please use our software; if you’re supporting practices feel free to fork our codebase and slap your own name on it.”
Sharing code is fine, but over time that strikes me as a recipe to grow lots of slightly-different versions unless someone (with more available time than I have) decides to fork the base and open it up wider for development by people outside the current 2-3 companies doing the bulk of the work. I know OpenEMR isn’t as polished-looking, but it seems to me that it’s a much more inviting environment for new developers to get involved with.
Alan,
Sorry I missed your post. It got buried with a bunch of spam. You have excellent points about ClearHealth/MirrorMed. A few thoughts:
ClearHealth has a pretty strict trademark policy, as you have stated. I would like to do something more friendly with MirrorMed, but the concern would be trademark dilution. If someone is really interested in using the MirrorMed trademark I will probably come up with a better way to handle the dilution issue.
As for the development style of the various communities: OpenEMR is a very approachable community, probably the best project in that particular regard. ClearHealth and MirrorMed should open up, but from a business perspective, both companies are more interested in ensuring good products for our own customers, than working with a community that generates only infrequent patches.
If you have a serious code-contribution that you feel is being overlooked, let me know and I will see what I can do.
-FT
Code gets forked for all kinds of reasons, political, technical, social, financial, managerial. It has been this way for a long time, as long as I have been able to keep track of (late 1980s).
Some people work better on their own private copy of a project so that they can change anything and everything and not have to be a part of a community. If you, as a community, build a system which has a license that allows this behavior, it’s bound to happen.