Something I have been thinking alot about lately is the issue of Software as a Service and how that model works with the network effect and open source software.

My thinking is prompted by a service that I am thinking of launching. The code behind the service is very simple, and while I have predilection to release everything I do under FOSS licenses, I am thinking of not releasing the code for this. Notice that I am not talking about making a proprietary software product, that would be unethical, I am talking about offering a service over the Internet, using code that is kept private. Private code is ethical, proprietary code is not. It is a matter of control, proprietary software allows a user to run software that they have no control over. Private software running a network service is often called the ASP loophole of freedom respecting software licenses like the GPL (but not the AGPL), but basically it is ethical because the user is not actually running the software at all, they are just accepting a benefit from that software. The moral issue gets convoluted when you have a service that maintains user data on the foreign site, rather than just providing a take-it-or-leave it service. Google, for instance, is in a very different position of responsibility when it chooses to offer an email service rather than a search service. If Google stopped providing search, that would suck, but if gmail went down and took years of my corresponence with it... that would -really- suck.

For certain kinds of critical data, I think it is unethical even to use private code. This should seem especially obvious for health information.

Before we get to my issue, I wanted to point out another organization that is in essentially the same position: StackOverFlow.com

StakOverflow is a site that supports the ability to ask very specific technical questions and then rank the answers that result. You see if StackOverflow releases its code open source, then you could have hundreds of separate question-answering sites start, all of which would have have only trivial amounts of users on each site: Joel (as in Joel on Software) discusses this issue in a podcast (transcript):

Spolsky:

As long as Stackoverflow is in -one- place, then all of its users go to one place to ask and answer questions. There is a network effect of all of those users going to to same location, it means more questions and more answers. More questions and more answers means better answers and questions since the whole point of the stack overflow architecure is that "more" becomes "better" through user voting. Better answers means that more people will go there to search, which means more users, which means more answers/questions which means better answers which means better users and you have the loop. The critical upward spiral of community collaboration where the more users you have the more valuable the central resource is.

What does this sound like? It sounds like open source software development and the way Wikipedia works. In fact there is a whole book about this upward spiral-through-open-collaboration effect called wikinomics.

But the upward spiral of the -content- on StackOverflow is hindred by attempting to open source the code. The code would obviously improve if it where open sourced, but the content would degrade. (Aside: It might be possible to find a way to turn the StackOverflow model into a protocol too, so that you could have multiple instances that would create a large disstributed system of StackOverflow instances. So that when you searched for bird watching on StackOverflow.com you might get results from BirdOverflow.com or whatever. This is what Google is trying to do with Google Wave)

It should be noted that StackOverflow actually already open sources the content that it produces, using a creative commons license for the questions and answers posted there. They also provide a data dump of the content, so that you can get it for programmatic use without bothering to screen scrape. So they really are making an open source contribution.

Back to my idea. I have a service that I will be launching soon that will also greatly benifit from the network effect on the content, but would be damaged by having multiple instances. I am inclined to not release the source code for this reason, but I have not yet made up my mind...

Update:

This got several good comments very quickly. Thanks for that, I really have not made up my mind on this issue and your comments have been very helpful.

Probably the most important information that I got is that there are several Open Source Stack Overflow clones in various stages of development.

www.cnprog.com a Chinese Stack Overflow clone, written in python/django sourcecode here, GPLv3
shapado.com a rails implementation of Stack Overflow, written in Rails and using MongoDB, sourcecode AGPL
Plurk Solace a python implementation of Stack Overflow sourcecode available under the BSD license
The award for closest-to-original name and architecture goes to Stacked sourcecode, which is in .NET and GPLv3

I had searched for Open Source implementation of Stack Overflow and had only found Stacked. Personally reimplementing something so that it will not be proprietary anymore and then using a proprietary language (no offense to mono) to do it in just seems pointless. Of course I really wish there was something in php, since that is my current

crutch

language of choice. Hopefully people looking for a GPL or BSD implementation of Stack Overflow might be able to find it now. Drop a comment if you have a good implementation in php!!

-FT

Network Effect vs Open Source