Fred Trotter

Healthcare Data Journalist


Network Effect vs Open Source

Something I have been thinking a lot about lately is the issue of Software as a Service and how that model works with the network effect and open source software.

My thinking is prompted by a service that I am thinking of launching. The code behind the service is very simple, and while I have a predilection to release everything I do under FOSS licenses, I am thinking of not releasing the code for this. Notice that I am not talking about making a proprietary software product; that would be unethical. I am talking about offering a service over the Internet, using code that is kept private. Private code is ethical, proprietary code is not. It is a matter of control: proprietary software has a user running software that they have no control over. Running private software behind a network service is often called the ASP loophole of freedom-respecting software licenses like the GPL (but not the AGPL). Basically it is ethical because the user is not actually running the software at all; they are just accepting a benefit from that software. The moral issue gets convoluted when you have a service that maintains user data on the foreign site, rather than just providing a take-it-or-leave-it service. Google, for instance, is in a very different position of responsibility when it chooses to offer an email service rather than a search service. If Google stopped providing search, that would suck, but if Gmail went down and took years of my correspondence with it… that would -really- suck.

For certain kinds of critical data, I think it is unethical even to use private code. This should seem especially obvious for health information.

Before we get to my issue, I wanted to point out another organization that is in essentially the same position: StackOverflow.com

StackOverflow is a site that lets people ask very specific technical questions and then rank the answers that result. You see, if StackOverflow released its code as open source, then hundreds of separate question-answering sites could start, each of which would have only a trivial number of users. Joel (as in Joel on Software) discusses this issue in a podcast (transcript):

Spolsky: Well, but they will suck away some the audience that might have come to us, thus reducing the network effect, and thus reducing the value to the entire community.

As long as StackOverflow is in -one- place, then all of its users go to one place to ask and answer questions. There is a network effect from all of those users going to the same location: it means more questions and more answers. More questions and more answers means better answers and questions, since the whole point of the StackOverflow architecture is that “more” becomes “better” through user voting. Better answers means that more people will go there to search, which means more users, which means more answers/questions, which means better answers, which means better users, and you have the loop. This is the critical upward spiral of community collaboration, where the more users you have, the more valuable the central resource is.
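To put rough numbers on the fragmentation worry, here is a minimal sketch. It rests on an assumption that is mine and not Joel's: that the value of a Q&A site scales with the number of possible asker/answerer pairs among its users (a Metcalfe-style guess). The function names and the 10,000-user figure are hypothetical, chosen only for illustration.

```python
def pair_count(users: int) -> int:
    """Number of distinct asker/answerer pairs among `users` people."""
    return users * (users - 1) // 2


def total_value(total_users: int, sites: int) -> int:
    """Sum of pair counts when users are spread as evenly as possible over `sites`.

    ASSUMPTION: "value" is modeled as possible pairings within each site;
    users on different sites never interact.
    """
    base, extra = divmod(total_users, sites)
    # `extra` sites get one additional user so the sizes add up to total_users.
    sizes = [base + 1] * extra + [base] * (sites - extra)
    return sum(pair_count(n) for n in sizes)


if __name__ == "__main__":
    for sites in (1, 10, 100):
        print(sites, "site(s):", total_value(10_000, sites), "possible pairs")
```

Under that assumption, splitting the same 10,000 people across 10 sites leaves about a tenth of the possible pairings, and 100 sites leaves about a hundredth. Real communities are messier than this, but it shows why "only trivial amounts of users on each site" hurts more than linearly.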

What does this sound like? It sounds like open source software development and the way Wikipedia works. In fact there is a whole book about this upward-spiral-through-open-collaboration effect called Wikinomics.

But the upward spiral of the -content- on StackOverflow would be hindered by open sourcing the code. The code would obviously improve if it were open sourced, but the content would degrade. (Aside: It might be possible to find a way to turn the StackOverflow model into a protocol too, so that multiple instances could form one large distributed system of StackOverflow instances. Then when you searched for bird watching on StackOverflow.com you might get results from BirdOverflow.com or whatever. This is what Google is trying to do with Google Wave.)

It should be noted that StackOverflow actually already open sources the content that it produces, using a Creative Commons license for the questions and answers posted there. They also provide a data dump of the content, so that you can get it for programmatic use without bothering to screen scrape. So they really are making an open source contribution.

Back to my idea. I have a service that I will be launching soon that will also greatly benefit from the network effect on the content, but would be damaged by having multiple instances. I am inclined to not release the source code for this reason, but I have not yet made up my mind…

Update:

This got several good comments very quickly. Thanks for that, I really have not made up my mind on this issue and your comments have been very helpful.

Probably the most important information that I got is that there are several Open Source Stack Overflow clones in various stages of development.

I had searched for an Open Source implementation of Stack Overflow and had only found Stacked. Personally, I think reimplementing something so that it will not be proprietary anymore, and then using a proprietary language (no offense to Mono) to do it, just seems pointless. Of course I really wish there was something in PHP, since that is my current crutch language of choice. Hopefully people looking for a GPL or BSD implementation of Stack Overflow might be able to find it now. Drop a comment if you have a good implementation in PHP!!

-FT

8 thoughts on “Network Effect vs Open Source”

  1. Lots of contradictions here — beyond the ones you admit. Let me point out just a couple.

    “It should be noted that StackOverflow actually already open sources the content that it produces, using a creative commons license for the questions and answers posted there. They also provide a data dump of the content, so that you can get it for programmatic use without bothering to screen scrape. So they really are making an open source contribution.”

    Not all Creative Commons licenses are free (or even “open source”) and in particular the one they chose is not. So Stackoverflow’s monopoly is maintained not only by keeping their server side code private. Actually, I just looked and they switched to CC-Wiki this summer, which I guess is free and open source. Nevermind that.

    Regardless, pleading “network effect” as a defense for keeping software proprietary seems unfounded given Wikipedia’s success and Stackoverflow’s failure. I haven’t found many signs that Stackoverflow actually exhibits a network effect. This isn’t just because the software is private. The range of topics it covers is narrow and I don’t think the “long tail” will ever kick in. The collaboration (read “competition”) model is overly individualistic (merit points) and therefore there is little cleanup or organization of past answers.

  2. “You see if StackOverflow releases its code open source, then you could have hundreds of separate question-answering sites start, all of which would have have only trivial amounts of users on each site”

    not quite. if SO releases its code open source, they won’t be able to monetize it as StackExchange (http://stackexchange.com/).

  3. Hard to argue that StackOverflow is a failure. I will update my post with links to some of the SO clones that I have found, but none of them come close to StackOverflow’s presence on the web.
    The network effect has a temporal component: the fact that StackOverflow is ahead will probably mean that it will stay ahead forever, no matter the license of the underlying code. The design of SO is genius, and once you get the “formula” right it is very easy to imitate a site like this.

    Wikipedia has had great success, but it’s also in the context of something that naturally has collaborative gravity, an encyclopedia. I am not sure that all collaboration efforts can pull off a wiki software release and a content project at the same time.

    You are right that SO has a narrow focus, but then that is part of its usefulness and initial draw. Technical questions are uniquely well suited to the model. They typically have exact answers that can be answered completely correctly. If you had an SO type site on parenting, for instance, it would be much more difficult to ensure that the best answers were always voted up. SO is very smart to apply its merit model first in an area where it will obviously work!!

    -FT

  4. You are quite correct, the reason for not releasing it is b/c of their monetization efforts. However, -I- am more concerned with the network effect… b/c that is what applies to my project…

  5. Very interesting post given our conversations about our SaaS and its multi-tenancy architecture a few months ago.

    Further review of the field has failed to support your claims that equivalent technology exists in open source.

    I find your argument about “private” software not being unethical to be more than a little soft. Private is not open source, regardless of who “runs” the executables. The key issue here is who owns the data – as you well know.

    On the other hand, I do find the previous comments you have made to me, about the need for certain institutions (universities, large hospital systems, governments) to have access to source code, very compelling.

    I would appreciate your thoughts on future pathways before I am completely overwhelmed by input from “open source” attorneys; so please email me when you have a moment.

  6. You know the ethical answer but perhaps are semi-blinded by a very normal sense of self interest and the human desire to control a beautiful new idea or service. In favor of the benevolent dictator approach is the Linux Kernel example. Against it, and taken to the extreme, your logic is basically the same as the one which makes Microsoft a virtual monopoly: A monarchy is more “efficient” than a democracy (but efficient at providing what the dictator decides, not what the community wants).

    The whole beauty of open source, is that it humanizes and empowers software developers by placing the financial value upon ongoing sweat equity and support of openly shared information & ideas, not on closely held proprietary ideas, concepts, and actual code valuable primarily for their artificial scarcity. The latter creates ‘gated communities’ where only a select few are invited to be ‘in the know’.

    You were on the correct track when you pointed out the alternative: Open sourcing the code, then allowing networks of networks to form.

    Even if you open source your code, if you provide the best implementation of your idea, in the fairest, most efficient, least expensive way, your reputation as a FOSS advocate will go a long way towards making your network software service the preeminent portal for users. If not, users will gravitate towards others who do it better, and the world benefits.

    This would result in the same or superior end value to users, with the primary loss being YOUR control, which is why I made the statements I did in the first paragraph. That is the open source model as I understand it.

  7. I’m most interested in the project you are working on. Keep it private or open source it. Doesn’t bother me either way since I’m sure you’ll do what’s in the best interest of the community. I’m just more interested in what you are releasing.
