Planet FLOSS Research

More info at: http://libresoft.es.

Olivier Berger (GET/INT): Call for comments on ADMS.F/OSS 0.3: a vocabulary for metadata describing software in forges

I am reposting here a news item I have just published elsewhere: “A vocabulary for metadata describing software, released for a call for comments”

The call-for-comments period for the ADMS.F/OSS v0.3 metadata vocabulary has just begun.

ADMS.F/OSS is a metadata vocabulary for describing free or open-source software (F/OSS), intended to make it easy to explore, find and link to software on the Web. The specification aims to reuse existing specifications as much as possible, such as DOAP, ADMS and the Trove software map. The current version of ADMS.F/OSS (version 0.3) was developed between January and April by a working group of 45 people from 14 different countries, and is open for public comment until 2 June.

For more details, please refer to the full announcement in English: Vocabulary for software metadata released for public review

If you develop or administer forges or software catalogues, this may well concern you.

Thanks in advance.

P.S.: for my part, my comments on this draft are at: https://joinup.ec.europa.eu/asset/adms_foss/topic/public-comments-admsf/oss-v03#comment-11982

Karl Beecher: Why choose Python for teaching?

I recently read a tweet by a computer science educator claiming the superiority of a particular programming language for teaching purposes (Pascal, if you must know). Now, I don’t really go for religious wars (each to his own and all that), but I did reply with my opinion that Python might generally be…

Mirko Boehm: “Intellectual Property Day” at the BDI: a review from the FSFE’s point of view

In 2000, WIPO proclaimed 26 April “World Intellectual Property Day” to raise awareness of how patents, copyrights, trademarks and designs affect our daily lives. On 26 April 2012, the Federation of German Industries (Bundesverband der Deutschen Industrie e.V., BDI) invited guests to its “Tag des Geistigen Eigentums” (Intellectual Property Day) at the Haus der Deutschen Wirtschaft. This time the motto of the event was “Geistiges Eigentum verpflichtet” (“intellectual property entails obligations”), which promised exciting discussions given the heat of the current debate on copyright reform. An emphasis on the obligations of rights holders in particular would have been a welcome addition. A delegation from the FSFE attended the event and waited a long time for the discussion to get to the point.

Gearboxes, chainsaws, headphones

That is because the talks were at first about gearboxes, chainsaws and headphones. A genuinely impressive presentation by VW on the direct-shift gearbox, and on its success built on the corresponding patents combined with VW’s licensing policy, initially put traditional engineering inventions and their protection by patents in the foreground. From VW’s point of view, “intellectual property obliges us to protect technologies and innovations so that high-wage countries like Germany can hold their own against competition from low-wage countries”.

The tool manufacturer Stihl reported, under the heading “How the protection of intellectual property benefits everyone”, on the alarming scale that product piracy has reached. It drew the connection between the names of companies and products, which lead consumers to expect a corresponding quality, and consumer safety, which copies of inferior quality do not guarantee. This presupposes that imitators always deliver worse quality with less consumer safety, which is surely true in the individual case demonstrated but, as the German solar industry is currently learning painfully, certainly cannot be generalized.

In the panel discussion, Volker Bartels of Sennheiser then reduced the Internet to a “marketplace for product piracy, with structures like the Mafia’s and profits like those of the drug trade”. A surprise was the remark by Uwe Wiesner, head of patents, trademarks and licenses at VW, that China has by now become a reliable partner when it comes to enforcing patents. Later, in conversation during the break, it was pointed out that around 30 percent of product copies now come from Germany (no source could be found for this). The Hamburg political scientist Ingrid Schneider stressed that ACTA has turned out to be a comprehensive literacy and awareness campaign for intellectual property and protective rights. Through the Internet, citizens come into contact with such rights more directly and more often, and through the ACTA debate they think about them more consciously, recognize their importance and take a clearer position. This welcome insight stood in contrast to earlier comments that had indirectly accused ACTA opponents of lacking expertise.

In the left corner Digitale Gesellschaft e.V., in the right corner industry

The current fault line between citizens’ interests and intellectual property was brought to a head in the debate between Günter Berg of the publisher Hoffmann und Campe and Markus Beckedahl of Digitale Gesellschaft e.V.

Beckedahl again pointed out that the term “intellectual property” is itself misleading and should be struck from the language. He questioned the current copyright term of seventy years after the author’s death and once more raised the idea of flat-rate levies for the benefit of authors. He explained why citizens who grew up with the Internet regard it as a public space.

Berg, in contrast, spoke of the myth of the Internet as a public space, which he said does not hold because the Internet is essentially controlled by a few very powerful companies such as Google and Facebook. Efforts must be made, he said, to develop a sense of wrongdoing among Internet users when they appropriate things without giving anything in return, just as they have such a sense in real life.

It seemed that the two sides could not reach a common position because they saw the role of the Internet in diametrically opposed ways. For example, the scope for political action is considerably broader if the Internet is a marketplace than if it is a public space in which fundamental political rights can be asserted. In the latter, access bans for repeated copyright infringement, for example, are unthinkable; in the former they are not.

Always beating around the bush

In his welcome address, Markus Kerber of the BDI singled out protective rights as the foundation of the German economy’s export success. Again and again, ACTA was described as an anti-piracy agreement whose implementation surely ought to be in everyone’s interest. Trademark infringement and product piracy were repeatedly cited as reasons to demonstrate the need for surveillance of the Internet. Intellectual property was highlighted as an integral part of a free, market-based social order. It is understandable that the BDI vehemently represents the interests of German industry, but presenting such positions in a relatively unreflective way hardly seems conducive to a civilized debate on intellectual property in the 21st century. The impression also arises that the BDI essentially represents the interests of established German companies. In any case, voices from tech start-ups were absent from the event (as were those of authors).

Max Stadler, Parliamentary State Secretary at the Federal Ministry of Justice, declared that the protection of property is a fundamental right from which the protection of intellectual property follows directly. The fact is, however, that the protection of property is a fundamental right that first and foremost ensures not only the protection of citizens from thieves but also the protection of citizens from the reach of state power. The Grundgesetz (Basic Law) guarantees the free development of personality and the freedom of art. Intellectual property first appears outside the Basic Law, in the Copyright Act (Urheberrechtsgesetz). That intellectual property has to be modelled within the conflict of interests between author and society is shown by the relevant restrictions: protective rights are generally valid only for a limited time, which is of course not the case for ownership of physical goods. The US Constitution contains a Copyright Clause that permits protection of intellectual property only for particular purposes and likewise for a limited time. Property and intellectual property are simply not the same, and the one does not follow directly from the other. The careless equation of property in physical goods with intellectual property is one of the FSF(E)’s core criticisms of the existing legal order, and it cannot be made to disappear with such a sweeping statement. Rather, the impression arises that this is an attempt to define the problem away.

“Intellectual property” as a misleading term

For anyone who lacked reasons until now, the BDI event showed clearly how misleading the term “intellectual property” itself can be. Product piracy, that is, selling imitations that fool consumers into believing they are buying products from a well-known manufacturer, is a problem of trademark law, meaning either the unlawful use of a name or of a very similar corruption of it (Stihl’s presentation mentioned products sold under the names “Still” or “Sthil”). A company’s right to its trademark is comparable to a person’s right to their own name, arises automatically and is potentially valid forever. In itself it has nothing in common with copyright or patents, yet in argument it repeatedly serves as proof that intellectual property is constantly being infringed and that its enforcement must therefore be stepped up. Even when product piracy infringes patents, that still establishes no connection to the application of copyright to the Internet. The arguments of the affected industrial companies are legitimate, and they must be supported in asserting their rights. But this is a problem of enforcing the existing legal order, whereas the laments of the collecting societies stem from the collapse of an outdated business model.

The tragic part is that this false lumping together of different kinds of matters does harm to industry in this case: the demand to combat product piracy is widely understandable and, since it accords with common sense, would surely find a political majority easily. But because ACTA carried, piggyback as it were with its anti-piracy measures, plans for surveillance over copyright infringement as well, the package as a whole provoked considerable political resistance. It is in the interest of the BDI and German industry to work towards untangling this jumble of assorted protective rights sold under the name “intellectual property”, and to tackle the resulting problem areas one by one: the fight against product piracy, copyright reform, the European patent, and so on. Such an approach would also allow German industry and the BDI to separate properly the debates on protecting physical goods on the one hand and applying copyright to information goods on the Internet on the other.

The Internet: marketplace or public space?

It is surprising that the question of the Internet’s character as a public space is still being discussed. So here once again is the understanding of those who grew up with the Internet: the Internet is the public space in which relationships are maintained (privacy), information is taken in (freedom of opinion) and provided (freedom of the press), one’s own overall image is cultivated (free development of personality), common goals are pursued in communities (freedom of assembly and association), messages are sent (secrecy of correspondence), … An exhaustive list is probably impossible, but who, seen this way, would deny the Internet the quality of a public space? This also makes clear why citizens perceive the denial of Internet access as if someone were considering reintroducing occupational bans (Berufsverbote). For netizens, taking away Internet access is comparable to house arrest for dissidents.

Politicians and interest groups such as the BDI are challenged in this context, and evidently in part overwhelmed, to bring their own understanding of the responsible citizen up to date. The Internet permits new forms of participatory democracy, since it actually makes it possible to ask every single citizen for their opinion on an issue. In this respect Markus Kerber of the BDI is wrong when he postulates that the Pirate Party’s only issue is the reinterpretation of intellectual property as a collective good. It is the spectre of the active participation of responsible citizens in transparent decision-making processes, which the Internet makes possible and which netizens know to be possible. Politicians and institutional interest groups such as the BDI perceive this as, at the very least, an unpleasant change. It should therefore come as no surprise that protectionist measures such as ACTA now stir large parts of the interested public to protest, whereas a few years ago they could still be comfortably settled behind closed doors between political committees and lobbyists. The BDI should work to include all sectors of German industry in a rationally conducted debate and to support solutions that are not immediately put back under scrutiny because they restrict civil liberties. A forward-looking, stable long-term regulation of copyright that does justice to societal, economic and individual interests is an important basis for the long-term growth of the German economy. The BDI can take a leading role here, because “Geistiges Eigentum verpflichtet”: intellectual property entails obligations.

@mirkoboehm • Mirko on LinkedIn • @AgileWorkers

David A. Wheeler: The magic cookie parable

In some presentations I include the “magic cookie parable”. Here is the parable, for those who have not heard it (I usually hold a cookie in my hand when I present it). Anyway…

I have in my hand… a magic cookie! Just one cookie will supply all your food needs for a whole year. What is more, the first one is only $1. Imagine how much money you will save! Imagine how much time you will save!

Ah, but there’s a catch. Once you eat the magic cookie, you can only eat magic cookies, as all other food will become poisonous to you. What’s more, there is only one manufacturer of magic cookies.

Do you think the cookie will be $1 next year? How about for the rest of your life? Are you as eager to eat the cookie?

Is that a silly parable? It should be. Yet many people accept, for themselves or on behalf of their organizations, information technology (IT) that is fundamentally a magic cookie. Too many are blinded into accepting technology that puts them, or their organization, completely at the mercy of a single supplier. You can call dependence on a single supplier a security problem, or a supply chain problem, or a support problem, or many other things. But no matter what you call it, it is a serious problem.

Now please do not hear what I am not saying. I am not here to attack any particular supplier. In fact, we all need suppliers, and I am grateful for suppliers! The problem is not the existence of suppliers; the problem is excessive dependency on any one supplier.

There are only a few information technology (IT) strategies that counter sole-supplier dependency that I know of:

  1. Build and control it yourself. In a few cases this is reasonable, but in most cases, that is too expensive and it risks obsolescence.
  2. Open systems/open standards. Here, you ensure that your system is made of modular parts with key interfaces covered by standards; that way, you can later switch to a different product. This can work, but suppliers may create proprietary extensions that (if you are not careful) lock you in anyway.
  3. Open source software. Since open source software allows anyone to modify and redistribute the software, if a supplier goes in a direction you do not like, you can band together with other customers to ensure a supply of software that meets your needs.
  4. A Combination. That is, a combination of the above.

Before getting locked into a single supplier, count the true cost over the entire time it will occur. Sure, in some cases, it may be worth it anyway. But you may find that this true cost is far higher than you are willing to pay. (The cookie image is by Bob Smith, released under the CC Attribution 2.5 license. Thank you!)

Israel Herraiz (UAX, Spain): The impact of bias in bug-fix datasets for defect prediction

Last week I gave a talk at UC Davis about the research work I will be doing during these months. It contains some preliminary results about the impact of bias in bug-fix datasets.

In projects with bug tracking systems and version control repositories, when a commit corresponds to a bug fix, it is usually marked accordingly (for instance, with a message like "Fixes bug #123"). This information can be used to recover the relation between commits and bugs, which is useful for defect prediction. The preliminary results I have obtained so far show that the impact of bias is negligible for defect prediction if the model is based on a binary classifier (that is, one that only predicts whether or not an entity will contain defects, not how many). It is true that a non-biased dataset can provide better accuracy, but only because, by definition, non-biased datasets contain more data. If we reduce the size of a non-biased dataset by extracting a random sub-sample, it is as good as a biased dataset of the same size. Well, at least for the two cases I have studied so far.
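As a rough illustration of the linking step described above, here is a minimal Python sketch. It assumes commit messages follow a "Fixes bug #123" style; the regular expression, the function names and the sample commits are all hypothetical, and real projects use many other conventions. The `subsample` helper only illustrates drawing a random sub-sample of a given size; it is not the procedure used in the talk.

```python
import random
import re

# Hypothetical pattern: matches messages such as "Fixes bug #123".
BUG_REF = re.compile(r"\bfix(?:es|ed)?\s+bug\s+#(\d+)", re.IGNORECASE)


def link_commits_to_bugs(commits):
    """Map each commit id to the bug ids mentioned in its message."""
    links = {}
    for commit_id, message in commits:
        bug_ids = [int(number) for number in BUG_REF.findall(message)]
        if bug_ids:
            links[commit_id] = bug_ids
    return links


def subsample(records, size, seed=0):
    """Draw a reproducible random sub-sample without replacement."""
    return random.Random(seed).sample(records, size)


# Made-up commits, for illustration only.
commits = [
    ("a1f3", "Fixes bug #123"),
    ("b7c2", "Refactor parser"),
    ("c9d4", "fixed bug #77 and fixes bug #78"),
]
links = link_commits_to_bugs(commits)
print(links)
```

Commits whose messages do not follow the assumed convention (like the second one here) are silently missed, which is exactly how this kind of linking can introduce bias.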

More details in the slides. You can also see the slides at Slideshare.net, and get a PDF copy.

Martin F Krafft (Univ Limerick): Mouse on Mars

The atmosphere in Munich’s Backstage Werk just before the opening act to the Mouse on Mars was very chilled. People sat on the stairs or scattered themselves over the dance floor while low-fi ambient tunes came from the speakers. It wasn’t loud, you had to try hard to hear the people mumble.

I have no idea who the opening act was, and their first tune was very nice and groovy. Then a noise explosion ensued; one could only pity the electronic equipment that was being asked to perform in ways that may be described as “everything else than you expect”, and of course, the bass beat shook the building; I am quite sure they didn’t use treble at all, but I may also simply have been unable to hear it. Plus, it seemed to us that the musicians catered for what may be a widespread decrease of attention span: it was noticeable how they jumped from one thing to the next, not leaving themselves (or their listeners) any time to get in the groove.

My brother and I went outside for a bit and talked about today’s music and its simplicity. We postulated the repetitiveness as the basis of a mass movement, considered “scene” clubs that played heavy techno to an audience that is so entirely different to who historically frequented such musical performances, and in general tried to avoid assuming a position between simplifying society and accepting that individual freedom is as eclectic as can be.

When MoM opened, they continued pretty much in line with their openers and half way through the first tune, I started to wonder how long I would last, or when it would be reasonable to step outside again. I had been a little afraid this would happen, having bought and listened to their latest album Parastrophics in preparation of the concert and not being able to get into it.

However, what then followed blew us away. Still heavy, still all over the place, but now they were developing sound scenes, ripping them apart, having fun playing with and teasing the audience, while putting on a groove that inevitably made your muscles twitch with the beat.

David Bowie called MoM “the next big thing” and I have to give it to them: MoM have always had a certain aura of “that’s what your music is like? we can improve on that!” to them, and yesterday, they continued along those lines with astounding consistency, and it felt fresh.

It also felt real. They weren’t just pushing buttons while computers made the music; they were making music and the computers were their instruments. Between the two founding members of MoM sat Dodo Nkishi, drummer and microphone artist, and if you don’t believe fast, big breakbeat can be performed live, well, you’re wrong.

Most everyone in the room was dancing. And while I was more swaying in awe, watching and wondering how the heck they are doing what they are doing, I couldn’t contain the bouncing any longer. They came back for an encore and there was no more stopping the crowd, Thomas or me.

Three tracks later, they waved goodbye and left, but a bunch of us simply continued to dance. Thomas questioned who would last longer and I started yelling loudly for another encore. The lights turned on, I considered it a slap in the face, but I did not stop yelling. Others tuned in. And then the lights went off and the band came back.

Following their 2.5 hour show, gosh was I exhausted. It was a magnificent show. If you aren’t afraid of big beat electronica and you take pleasure in nonstandard art, I heartily recommend you ensure that MoM aren’t soon playing near you without you there.

PS: MoM will play at the Düsseldorf Open-Source Festival (http://www.open-source-festival.de/en/) on 30 June 2012!

PPS: Now I listen to Parastrophics and I am really enjoying it.

NP: Mouse on Mars: Parastrophics

David A. Wheeler: DoD Open Source Software (OSS) Pages Moved

The US Department of Defense (DoD) has changed the URLs for some of its information on Open Source Software (OSS). Unfortunately, there are currently no redirects, and that makes them hard to find (sigh). Here are new links, if you want them.

A good place to start is the Department of Defense (DoD) Free Open Source Software (FOSS) Community of Interest page, hosted by the DoD Chief Information Officer (CIO).

From that page, you can reach:

If you are interested in the topic of DoD and OSS, you might also be interested in the Military Open Source Software (Mil-OSS) group, which is not a government organization, but is an active community.

David A. Wheeler: Insecure open source software libraries?

The news is abuzz about a new report, “The Unfortunate Reality of Insecure Libraries” (by Aspect Security, in partnership with Sonatype). Some news articles about it, like Open source code libraries seen as rife with vulnerabilities (Network World) make it sound like open source software (OSS) is especially bad. (To be fair, they do not literally say that, but many readers might infer it.)

However, if you look at the report, you see something quite different. The report directly states that, “This paper is not a critique of open source libraries, and we caution against interpreting this analysis as such.” They only examined open source Java libraries, but their “experience in evaluating the security of hundreds of custom applications indicates that the findings are likely to apply to closed-source and commercial libraries as well.”

This is a valuable report, because it points out a general problem not specific to OSS.

The problem is that software libraries (OSS or not) are not being adequately managed, leading to a vast number of vulnerabilities. For example, the report states that “The data show that most organizations do not appear to have a strong process in place for ensuring that the libraries they rely upon are up-to-date and free from known vulnerabilities.” They point out that “development teams readily acknowledge, often with some level of embarrassment, that they make no efforts to keep their libraries up-to-date.” They also note that “Organizations download many old versions of libraries… If people were updating their libraries, we would have expected the popularity of older libraries to drop to zero within the first two years. However, the data clearly show popularity extending back over six years…. The continuing popularity of libraries for extended months suggests that incremental releases of legacy applications are not being updated to use the latest versions of libraries but are continuing to use older versions.” They recommend that software development organizations inventory, analyze, control, and monitor their libraries, and give details on each point.
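To make the "inventory and analyze" part of that recommendation concrete, here is a minimal Python sketch. It is not taken from the report: the library names, the version numbers and the advisory table are invented for illustration, and the version parser assumes plain dotted integers such as "1.2.0".

```python
def parse_version(text):
    """Turn a dotted version such as "1.2.0" into a comparable tuple."""
    return tuple(int(part) for part in text.split("."))


def outdated_libraries(inventory, first_safe_version):
    """List libraries whose version predates the first version without known flaws."""
    flagged = []
    for name, version in sorted(inventory.items()):
        safe = first_safe_version.get(name)
        if safe is not None and parse_version(version) < parse_version(safe):
            flagged.append((name, version, safe))
    return flagged


# Hypothetical inventory of the libraries an application ships with.
inventory = {"examplelib": "1.2.0", "otherlib": "3.1.4", "thirdlib": "0.9"}
# Hypothetical advisory data: first release with no known vulnerabilities.
first_safe_version = {"examplelib": "1.2.5", "otherlib": "3.0.0"}

flagged = outdated_libraries(inventory, first_safe_version)
print(flagged)
```

A real process would draw the advisory data from a maintained vulnerability feed and would be rerun regularly, which is the "monitor" step.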

I should note that I’ve been saying some of these things for years. For years I have said that you should evaluate OSS before you use it… some software is better than others. Back in 2008 I also urged developers to use system libraries, at least as an option; embedding libraries often leads over time to the use of old (and vulnerable) libraries. An advantage of OSS is that many people can review the software, find problems (including vulnerabilities), and fix them… but this advantage is lost if the fixed versions are not used! And of course, if you develop software, you need to learn how to develop secure software. As the report notes, tools can be useful (I give away flawfinder), but tools cannot replace human knowledge and human review.

For more information, you should see their actual report, “The Unfortunate Reality of Insecure Libraries” (by Aspect Security).

Israel Herraiz (UAX, Spain): Visiting UC Davis

I have been in Davis, California, for a couple of days now, on a 4-month visit to UC Davis, hosted by Prof. Prem Devanbu. This visit is possible thanks to a "José Castillejo" grant awarded by the Spanish Ministry of Education and Science.

The main goal of this visit is to work on finding an automated method to evaluate the bias in bug datasets. This bias is introduced when the bug-fix reports are linked with commits in the version control system. When a developer accepts and/or fixes a bug report, she decides on a severity level and marks the report accordingly. In Bugzilla, one of the most used bug tracking systems, a developer can mark severity using a seven-level scale. In a previous paper (PDF available), I have shown that not all developers use the same criteria to select the severity, and that three levels should be enough. This difference in the developers’ criteria for marking and classifying bug reports is one of the sources of bias in the bug-fix datasets (PDF of the paper available). Another source of bias is developer confidence; not all developers mark commits or bug reports with commit ids when they are starting in a project, because they are afraid of exposing themselves. However, those commits do correspond to bug fixes, and should be accounted for in a bug-fix dataset.

This bias disease affects the Eclipse Bug Data from the Software Engineering Chair at Saarland University, which is one of the main data sources used for empirical software engineering. As an example, a paper studying the distribution of software bugs, based on that Eclipse data, has prompted a response that found other, better distribution fits, and that does not reuse the same dataset but gathers the data directly from the original sources.

Clearly, reusing datasets for empirical software engineering is a good idea, which fosters reproducibility and verifiability, essential properties of any empirical research discipline. However, if we cannot assure the quality of the reusable datasets, they can cause more harm than good.

My goal with this visit is to apply statistical methods to evaluate the bias in a bug-fix dataset. The two papers about the distribution of bugs in Eclipse are an example of the kind of work I want to do. If we can be sure of the quality and lack of bias of a dataset, carefully built to act as a "canonical" dataset, we can compare other datasets against that canonical dataset, to find out if there is any bias. The two papers about Eclipse mentioned above show that the distribution of bugs can vary in the presence of bias. The first paper used a biased dataset, and the second paper repeated the data gathering process from scratch, avoiding the use of the biased dataset. Although it can also be due to methodological differences, they found different distributions for software bugs.

So my goal is to measure this difference in the distribution using a statistical technique, to detect the presence of bias, and develop a statistical test to find bias in reusable datasets. I am assuming here that the distribution cannot change due to other factors (and we already know that there are other sources of bias in bug reports), and that the shape of the distribution is unique. The second assumption is quite fair, but the first assumption is more complicated, and it will require finding more than one dataset that is known to be unbiased.
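The post does not say which statistical technique will be used. Purely as an illustration, one standard way to compare two empirical distributions is the two-sample Kolmogorov-Smirnov test. The sketch below is an assumption about what such a comparison could look like, not the method of this work; the function names and the sample data are hypothetical, and the critical value is the usual large-sample approximation.

```python
import math
from bisect import bisect_right


def ks_statistic(sample_a, sample_b):
    """Largest gap between the empirical CDFs of two samples."""
    a = sorted(sample_a)
    b = sorted(sample_b)
    points = sorted(set(a) | set(b))
    return max(
        abs(bisect_right(a, x) / len(a) - bisect_right(b, x) / len(b))
        for x in points
    )


def distributions_differ(sample_a, sample_b, alpha=0.05):
    """Large-sample KS decision: True if the samples look different."""
    n, m = len(sample_a), len(sample_b)
    critical = math.sqrt(-math.log(alpha / 2) / 2) * math.sqrt((n + m) / (n * m))
    return ks_statistic(sample_a, sample_b) > critical


# Made-up bug counts per file for a "canonical" and a candidate dataset.
canonical = list(range(100))
candidate = list(range(100, 200))
print(ks_statistic(canonical, candidate), distributions_differ(canonical, candidate))
```

A rejection only says the two samples differ; attributing that difference to bias still depends on the assumption, stated above, that nothing else changes the distribution.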

I hope this work will provide a tool to assess the quality of a bug-fix dataset, and to avoid the problems of bias, which are a threat to the validity of all the empirical studies using these bug-fix datasets.

Karl Beecher: Partnering with Agile Workers

More exciting news. Mirko Böhm, co-founder of Agile Workers Software, has invited me to become a partner in the company. I’ve gladly accepted. Agile Workers is a cool, new Berlin-based start-up that offers a diverse array of software services. (By the way, Berlin really is becoming the place in Europe for tech start-ups. Read all about…

Carlo Daffara (Conecta.it): A new EveryDesk is out!

We were particularly happy about our work on EveryDesk – a portable, fully working live Linux installation on a USB disk. But we found out that more and more people were looking for more space, a more modern environment, and in general to refresh things. We have been busy with our other pet project – CloudWeavers, a private cloud toolkit, and we redesigned EveryDesk to be the ideal client environment for companies and administrations that are moving totally or partially to a private or public cloud. We took several ideas from ChromeOS, but frankly speaking the hardware support was extremely limited, and even with exceptional ports like Hexxeh’s “Lime” the user experience is still less than optimal. We have basically redesigned everything – the base operating system is now derived from OpenSuse (mainly thanks to the excellent package management tool, which drastically increases the probability that the system will continue to work after an update – a welcome change from Ubuntu), and we integrate Gnome 3, the latest Firefox and Chromium on a BTRFS install that supports compression and error concealment, so it works properly even on low-cost USB devices. On an 8 GB USB key, you get 4 GB free, and all the apps at your disposal, ready to go.

The only major change in hardware support is the fact that EveryDesk is now a 64-bit only operating system, but we believe that despite the limitation it can still be useful at large. It integrates some components that are maybe less interesting for individual use – for example the XtreemFS file system, that can be used to turn individual PCs into scale-out storage servers in a totally transparent way, and with great performance, or many virtualization enhancements. On the user side, we already installed some of our favorite additions among fonts, software, and tools; Firefox uses by default the exceptional Pdf.js embedded viewer, that uses no separate plugins and is faster than Adobe Acrobat, and there is the usual assortment of media codecs and ancillary little things.

We love every moment that we work on this project, and I would like to thank the many people who helped us and sent criticism and praise. One wrote “I can’t believe how well it works, without time lags I normally associate with running on a CD or a thumb” and I can’t thank our users enough – they are our real value. As usual, you can download EveryDesk from Sourceforge.


David A. Wheeler: Software patents may silence little girl

Software patents are hurting the world, but the damage they do is often hard to explain and see.

But Dana Nieder’s post “Goliath v. David, AAC style” has put a face on the invisible scourge of software patents. As she puts it, a software patent has put her “daughter’s voice on the line. Literally. My daughter, Maya, will turn four in May and she can’t speak.” After many tries, the parents found a solution: A simple iPad application called “Speak for Yourself” that implements “augmentative and alternative communication” (AAC). Dana Nieder said, “My kid is learning how to ‘talk.’ It’s breathtaking.”

But now Speak for Yourself is being sued by a big company, Semantic Compaction Systems and Prentke Romich Company (SCS/PRC), who claims that the smaller Speak for Yourself is infringing SCS/PRC’s patents. If SCS/PRC wins their case, the likely outcome is that these small apps will completely disappear, eliminating the voice of countless children. The reason is simple: Money. SCS/PRC can make $9,000 by selling one of their devices, so they have every incentive to eliminate software applications that cost only a few hundred dollars. Maya cannot even use the $9,000 device, and even if she could, it would be an incredible hardship on a Bronx family with income from a single 6th grade math teacher. In short, if SCS/PRC wins, they will take away the voice of this little girl, who is not yet even four, as well as countless others.

I took a quick look at the complaint, Semantic Compaction Systems, Inc. and Prentke Romich Company, v. Speak for Yourself LLC; Renee Collender, an individual; and Heidi Lostracco, an individual, and it is horrifying at several levels. Point 16 says that the key “invention” is this misleadingly complicated paragraph: “A dynamic keyboard includes a plurality of keys, each with an associated symbol, which are dynamically redefinable to provide access to higher level keyboards. Based on sequenced symbols of keys sequentially activated, certain dynamic categories and subcategories can be accessed and keys corresponding thereto dynamically redefined. Dynamically redefined keys can include embellished symbols and/or newly displayed symbols. These dynamically redefined keys can then provide the user with the ability to easily access both core and fringe vocabulary words in a speech synthesis system.”

Strip away the gobbledygook, and this is a patent for using pictures as menus and sub-menus. This is breathtakingly obvious, and was obvious long before this was patented. Indeed, it would have been obvious to most non-computer people. But this is the problem with many software patents; once software patents were allowed (for many years they were not, and they are still not allowed in many countries), it’s hard to figure out where to end.

One slight hope is that there is finally some effort to curb the worst abuses of the patent system. The Supreme Court decided on March 20, 2012, in Mayo v. Prometheus, that a patent must do more than simply state some law of nature and add the words “apply it.” This was a unanimous decision by the U.S. Supreme Court, remarkable and unusual in itself. You would think this would be obvious, but believe it or not, the lower court actually thought this was fine. We’ve gone through years where just about anything could be patented. Because software patents and business patents are allowed, the Patent and Trademark Office has become swamped with patent applications, often for obvious or already-implemented ideas. Other countries prevent such abuse by simply not allowing these kinds of patents in the first place, giving them time to review the rest. See my discussion about software patents for more.

My hope is that these patents are struck down, so that this 3-year-old girl will be allowed to keep her voice. Even better, let’s strike down all the software patents; that would give voice to millions.

Olivier Berger (GET/INT): How to manage and export bibliographic notes/refs in org-mode

I’ve felt the need to manage my bibliography with org-mode, allowing me to write drafts of papers while keeping track of all the literature I’ve already read and published.

There are already many resources which explain how to integrate org-mode with reftex, for instance, in order to cite papers inside org-mode, or how to link to bibliographic references in bibtex format using org-bibtex.

People have also posted hints on how to manage bibliographic notes inside an org-mode file, which makes it possible to keep track of read papers, tag them, add comments, and link these notes to the bibtex file contents.

But I couldn’t find a single comprehensive resource explaining if/how to manage links to such bibliographic notes that can both be navigated inside org-mode and be exported to LaTeX for previewing article drafts.

Here’s a proposal that attempts to bind all these needs together.

Let’s say we have one bibtex file ~/org/bibliography.bib which contains all the papers references.

We’ll also add into ~/org/bibliography.org all the notes relating to these articles. These notes will be identified by CUSTOM_ID properties which will contain the bibliographic reference of the papers.

Then we can create a draft in ~/org/draft.org which takes advantage of these.

We can then use two new link prefixes, bib and note to create links to entries in the bibtex file and the corresponding bibliographic notes. These are based on the use of a special rtcite link, that will be handled by a bit of emacs lisp.

Provided that some code is added in the .emacs to treat link opening and latex export for these rtcite links, we now have a valid solution :

  • clicking on a note:abibref link in an org-mode document will jump to the corresponding bibliographic note about a particular paper ‘abibref’ (a section in ~/org/bibliography.org which has a :CUSTOM_ID: abibref property).
  • clicking on bib:abibref link in an org-mode document will jump to the corresponding bibliographic reference in the bibtex file.
  • exporting an org document containing either of the above links to LaTeX will produce the correct \cite{abibref} LaTeX reference (see the results here : draft.pdf).

Details of the bibliographic notes contents (~/org/bibliography.org):

#+LINK: bib rtcite:bibliography.bib::%s
#+LINK: note rtcite:bibliography.org::#%s
#+title: My bibliographic notes

# \bibliography{bibliography}

* My papers
** 2005
*** Why and how to contribute to libre software when you integrate them into an in-house application ?
:PROPERTIES:
:CUSTOM_ID: bac05why
:END:

[[bib:bac05why][BibTeX]] .

/This is an interesting paper.../

See also [[note:berger06integration]]

In the above, note that note: links use rtcite links with a # character, which will allow jumping to the CUSTOM_ID property.

Details of a paper draft (~/org/draft.org) :

#+LINK: note rtcite:~/org/bibliography.org::#%s

#+LINK: bib rtcite:~/org/bibliography.bib::%s

#+title: How to mix org and bib for fun and profit
#+author: Olivier Berger

# \bibliography{bibliography}

* Read a lot

See [[note:bac05why][Why and how to contribute to libre software]] or [[bib:berger06integration]] .

#+BIBLIOGRAPHY: bibliography plain limit:t

Excerpts of the corresponding .emacs :

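;; Export handler for rtcite: links. PATH looks like "file::key" or "file::#key".
(defun my-rtcite-export-handler (path desc format)
  (message "my-rtcite-export-handler is called : path = %s, desc = %s, format = %s" path desc format)
  ;; Extract the citation key that follows "::" (skipping an optional "#").
  (let ((search (when (string-match "::#?\\(.+\\)\\'" path)
                  (match-string 1 path))))
    (cond ((eq format 'latex)
           ;; No description, or a description that is just the raw link: plain \cite.
           ;; string-prefix-p avoids depending on the cl package for `search'.
           (if (or (not desc)
                   (string-prefix-p "rtcite:" desc))
               (format "\\cite{%s}" search)
             ;; Otherwise the description becomes the optional argument of \cite.
             (format "\\cite[%s]{%s}" desc search))))))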

(require 'org)

(org-add-link-type "rtcite"
                   'org-bibtex-open
                   'my-rtcite-export-handler)

The above is an adapted version of a proposal sent to the org-mode list by Nick Dokos in a response to Andreas Willig : http://lists.gnu.org/archive/html/emacs-orgmode/2012-02/msg00640.html
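To make the export rule concrete, here is a small Python sketch that mirrors the logic of the export handler shown earlier. It is only an illustration of the mapping from link to LaTeX, not part of the setup itself (the real handler is the Emacs Lisp code), and the function name rtcite_to_latex is made up for this example.

```python
import re


def rtcite_to_latex(path, desc=None):
    """Map an rtcite path such as 'file::key' or 'file::#key' to a LaTeX citation."""
    # Same pattern as the Emacs Lisp handler: the key follows "::" and an optional "#".
    match = re.search(r"::#?(.+)\Z", path)
    key = match.group(1) if match else None
    # No description, or a description that is just the raw link text: plain \cite.
    if desc is None or desc.startswith("rtcite:"):
        return "\\cite{%s}" % key
    # Otherwise the description becomes the optional argument of \cite.
    return "\\cite[%s]{%s}" % (desc, key)


# The two links used in draft.org, after the bib: and note: abbreviations are expanded.
print(rtcite_to_latex("~/org/bibliography.bib::berger06integration"))
print(rtcite_to_latex("~/org/bibliography.org::#bac05why",
                      "Why and how to contribute to libre software"))
```

The first call corresponds to a link without a description and the second to a link with one, which is why only the second produces an optional argument in the citation.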

Diomidis Spinellis (Athens Univ): How do Big US Firms Use Open Source Software?

We hear a lot about the adoption of open source software, but when I was asked to provide hard evidence there was little I could find. In an article I recently published in the Journal of Systems and Software together with my colleague Vaggelis Giannikas, we tried to fill this gap by examining the type of software the US Fortune 1000 companies use in their web-facing operations. The results were not what I was expecting.

David A. Wheeler: Introduction to the autotools (autoconf, automake, libtool)

I’ve recently posted a video titled “Introduction to the autotools (autoconf, automake, and libtool)”. If you develop software, you might find this video useful. So, here’s a little background on it, for those who are interested.

The “autotools” are a set of programs for software developers that include at least autoconf, automake, and libtool. The autotools make it easier to create or distribute source code that (1) portably and automatically builds, (2) follows common build conventions (such as DESTDIR), and (3) provides automated dependency generation if you’re using C or C++. They’re primarily intended for Unix-like systems, but they can be used to build programs for Microsoft Windows too.

The autotools are not the only way to create source code releases that are easily built and packaged. Common and reasonable alternatives, depending on your circumstances, include CMake, Apache Ant, and Apache Maven. But the autotools are among the most widely used such tools, especially for programs that use C or C++ (though they’re not limited to that). Even if you choose to not use them for projects you control, if you are a software developer, you are likely to encounter the autotools in programs you use or might want to modify.

Years ago, the autotools were hard for developers to use and they had lousy documentation. The autotools have significantly improved over the years. Unfortunately, there’s a lot of really obsolete documentation, along with a lot of obsolete complaints about autotools, and it’s a little hard to get started with them (in part due to all this obsolete documentation).

So, I have created a little video introduction at http://www.dwheeler.com/autotools that I hope will give people a hand. You can also view the video via YouTube (I had to split it into parts) as Introduction to the autotools, part 1, Introduction to the autotools, part 2, and Introduction to the autotools, part 3.

The entire video was created using free/libre/open source software (FLOSS) tools. I am releasing it in the royalty-free WebM video format, under the Creative Commons CC-BY-SA license. I am posting it to my personal site using the HTML5 video tag, which should make it easy to use. Firefox and Chrome users can see it immediately; IE9 users can see it once they install a free WebM driver. I tried to make sure that the audio was more than loud enough to hear, the terminal text was large enough to read, and that the quality of both is high; a video that cannot be seen or heard is ridiculous.

This video tutorial emphasizes how to use the various autotools pieces together, instead of treating them as independent components, since that’s how most people will want to use them. I used a combination of slides (with some animations) and the command line to help make it clear. I even walk through some examples, showing how to do some things step by step (including using git with the autotools). This tutorial gives simple quoting rules that will prevent lots of mistakes, explains how to correctly create the “m4” subdirectory (which is recommended but not fully explained in many places), and discusses why and how to use a non-recursive make. It is merely an introduction, but hopefully it will be enough to help people get started if they want to use the autotools.

Mirko Boehm: “Managing Trust” in mixed commercial and volunteer Open Source communities

After an extensive collection of feedback, “Managing Trust” was chosen as the keynote theme at Cebit this year. Most commentators relate that to building up trust in the commercial offerings of software and hardware companies. Making Free Software, the KDE community does not have this kind of problem – end users place a high level of trust and goodwill in its solutions. The fact that what our software does is verifiable by looking at the code further improves this impression. Instead, potential points of conflict lie within the community, where volunteers and companies are working on a common product based on their own sets of potentially conflicting interests. This article accompanies the presentation of the same title that was given at Cebit’s Open Source Forum. The accompanying slides can be found here.

It is well established that Free Software can be put to commercial use, and already is to a large extent. Communities consisting of a mix of volunteer and employed contributors are probably the norm. So it may come as a surprise to some (especially to companies trying to take part in a community driven project) that the collaboration is not always smooth and harmonious. The conflicts mostly arise out of diverging sets of interests, but sometimes harm to the community can also be the result of well-intended actions by companies.

Potential points of conflict

The crowding out of volunteers is such an issue. It happens especially in well-established communities that include successful companies. With the numbers of employed contributors rising, volunteers increasingly develop the impression that it is impossible for them to keep up with the project pace, and drop out. This has happened in the KDEPIM community for example and luckily was noticed by the involved companies. Those then took active measures to reduce the problem by increasing transparency and encouraging volunteer contributions, for example by financing regular developer sprints. We are not always that lucky.

Attribution for commercial use is another such potential conflict. Generally, it threatens the existence of a Free Software project if one part of the community mainly pursues commercial goals, while others contribute as volunteers and do not benefit from it. The famous Mambo/Joomla fork occurred over such disputes. Attribution agreements and similar arrangements solve this problem from a legal perspective, but within the community the issue remains. The Qt Project for example has a rather good standing as a community that produces free and commercial versions of the same product. This requires building trust over an extended period of time, a trust that can easily be shattered by disruptive changes like the Nokia technology switch away to Windows Phone. As in real life, trust is hard to gain and easy to lose.

Another particularly tricky issue is the viability of the community over time when the project is successfully growing its following. Companies tend to grow proportionally with the community, and naturally prefer to hire developers that already know the technology and have a proven track record of working in a diverse, chaotic environment. If this process of companies ingesting volunteers happens faster than the community is able to attract “fresh blood” from new contributors, the resulting lack of volunteers hinders the project’s progress. A secondary aspect is that companies will naturally try to catch the biggest fish in the pond, hiring away the best developers first, who then work there for an extended period of time. This can create an impression that valuable, reliable contributions come from those companies, whereas the volunteers are mostly inexperienced newcomers. It would be easy to claim that companies should not hire from the community extensively, but that ignores one of the major motivations of volunteers to take part – especially if these are students, it is a common goal to build a reputation as a Free Software contributor to later find an interesting job more easily.

This leads nicely into the requirement to retain embedded knowledge in the community. It is natural for communities to experience churn where contributors leave after a while (maybe because they graduated, or through a change in life), and new ones join the project (maybe because they started studying, or they retired and now have more time at hand). People that leave take their experience with them, leaving the young folks to make the same mistakes again. KDE e.V. offers volunteers the option to stick around as passive members, which allows them to contribute their opinion on the topics they know well. Companies exist independently of individual employees, and usually have more stable teams with processes of knowledge management in place. A particular danger that manifests itself over time is that a team of corporate employees can represent more substantial embedded knowledge about the project than any volunteer. Companies can counter this effect through extensive openness and transparency, but it does require a conscious effort.

The first thing that usually comes to mind when thinking about the activities of companies in communities is patent and trademark issues. Of the systematic problems described, this seems to be a minor one, as long as the community did its homework and has defined a policy of licensing and trademarking. This underlines the importance of those annoying license headers in the code, and of the non-technical contributions to the project that handle the legal aspects.

The constitution of KDE

KDE is a mixed volunteer and commercial community. While many see it almost exclusively as a non-commercial Free Software project, this impression is not necessarily correct. A mature ecosystem (bingo!) of companies is active in and around KDE, and only recently new spin-offs like ownCloud, MakePlayLive or Agile Workers Software have surfaced. The question is how well KDE is set up to manage the conflicts described above, and where it could use some improvement.

The KDE community is very much a meritocracy, with merit being earned through stable contributions over an extended period of time. The clout of a contributor is not easily transferred to another, so that companies will have a hard time finding a replacement for an employee that leaves. The development community is also very self-directed, with no central group that decides on technical issues. Some opinions are heard more than others, but that happens on a voluntary basis as well, pointing back to meritocracy. KDE employs the “All doors open” approach, where new contributors quickly get commit access to all parts of the software project. This way of inviting as many people to contribute as possible is highly important to the self-conception of the community, and signifies how KDE generally strives to be open and inviting to potential contributors. At the level of the individual contributor, there is the general “Who does the work decides” rule which avoids a potential specialization in architects and implementors. The contributing author is free to choose the style and means of the implementation of a feature, without having to ask a “higher authority” (which does not exist) for directions. The main corrective to quality problems is that due to network effects, a sub-par implementation is very likely to be replaced with a better one by another developer. It is important to note that this rule provides a check to the meritocratic way, as it limits the influence of the “oldies” over the activity of new, more active project members. Where companies compete for revenue in markets, Free Software contributors compete for keeping their contributions in the code. This process is very competitive because of the openness to a large number of reviewers and contributors.

At the heart of the KDE community sits KDE e.V., the stakeholder’s association of the contributors. “e.V.” means “eingetragener Verein” or incorporated society, in this case a not-for-profit one. KDE contributors are invited to join based on their merit, which partially aligns the membership of KDE e.V. with the highest ranking contributors in the meritocracy. Only individuals are accepted as members with a vote, no entities (organizations). The invitation needs to be extended by an existing member of KDE e.V., but that barrier is not a steep one. Somebody with an observable history of contributions over a few months will generally be accepted. To keep contributors around, KDE e.V. offers the model of passive membership as described earlier. Entities (like companies, but also educational institutions or public bodies) can only become “Supporting Members” at the price of an annual donation. This buys them a shiny badge for their web site, but not even a single vote in decisions taken by the membership. The rationale for this setup lies in the fear that corporate members would take over and dominate the organization if left with a vote. KDE e.V. is designed to protect the rule of the community over the KDE project’s outcome, and at the time it was set up, community was considered to be individuals, not entities. Such protective measures are deemed necessary since KDE e.V. is, for example, the holder of various trademarks like the one for KDE itself, and also to the copyright of much of the code due to fiduciary license agreements.

Criteria of fair collaboration

While the responsibilities of contributors are often spelled out clearly in project manifestos or similar pamphlets, it is not very common for Free Software projects to explicitly state the rights contributors can claim when being a part of the community. KDE’s otherwise excellent Code of Conduct contains many Dos and Don’ts, but not a bill of rights. Herein lies a reason for some of the difficulties for conflict resolution, since violations of rights can seldom be traced back to positive actions mentioned in the code of conduct, as compared to forbearance of support or help. If the requests of a contributor are ignored, nobody violated the code of conduct, because nobody did anything.

Contributor Rights

A (probably incomplete) list of rights for a community member could be as follows:

  • A contributor has the right to influence development and the other productive processes in the community. This more technical aspect includes access to the proceedings of working groups, and a transparent decision making process about practical issues. This rule essentially guarantees the same opportunity to gain merit in the meritocracy to all. KDE does very well in this regard.
  • A contributor has the right to be part of the community and collaborate. This aspect focuses more on organizational and social issues. It means, for example, when voicing an opinion, one is heard and given a corresponding response rather than being ignored. Another example would be that invitations to sprints, development meetings or conferences should be handled as openly as possible, so as to not exclude anybody from attending. Installing such a rule would mean that the rather common back room politics become unacceptable according to the project’s own standards.
  • In a Free Software project, contributor’s rights should include exercising the Four Freedoms postulated by the FSF (or a similar definition of freedom, according to the community’s preference): run the product, study and change it, redistribute it and redistribute modified versions. What needs to be made explicit is under which circumstances redistribution is acceptable. For example, it should be very clear if commercial versions based on the original product are wanted. Just the fact that such a basic rule is documented in the contributor’s rights alone should already reduce the amount of friction caused by commercial spin-offs.
  • A contributor should have the right to benefit from her or his own contributions. This means that authors have the right to be named (which is the case in most projects), but not based on the requirements of some arcane copyright law, but on consideration of the necessary attribution for the contributor. The corollary to this is that a contributor should not overly benefit from the work of others. So a company should not claim that it developed KDE practically all alone because it contributed to it, as this would diminish other people’s contributions.

The list of contributor’s rights so far was likely not that controversial. But what about other rights? Especially those companies would like to attain when joining the project? For example, companies may want to attribute contributions of others to themselves so that they can produce a commercial version of the software, parallel to the Free Software release. While such a setup is unacceptable in the KDE context, it is common in the Qt Project, but also in research consortia or with single-vendor Open Source products (like Zarafa). Such a right may conflict with the right of the individual contributors to benefit from their own work (as described above). A Free Software community is well advised to clearly state whether or not such attribution is acceptable, and to enforce these rules. A well defined license policy already resolves much of the uncertainty.

A similar problem is that of building IP assets like trademarks and copyright based on the product of a Free Software project. Building up intangible assets is an important way to increase the value of a company, so corporate contributors naturally have an interest in pursuing building up IP based on the project. It is strongly expected that a Free Software community will refuse such requests, since the central goal is most often to build the software equivalent of a commons; KDE certainly would. Companies should not try to appropriate the IP of Free Software projects, as that will likely lead to a dysfunctional community or major conflict.

Contributor Responsibilities

With rights come responsibilities, and vice versa. The KDE code of conduct lists the following: be considerate, be respectful, be collaborative, be pragmatic, support others in the community, and get support from others in the community. This is a good start, but here are a few more:

  • Be open about your goals. Make them known, and act according to them. It is quite understandable that different entities have differing sets of motivations. Honesty is better in all cases – it is better to say “We need this fixed, because our product requires it” than to say “We have only the best for the community in mind”.
  • Be transparent about activities. A community thrives on collaboration. Forking a project secretly and later dumping a “much improved” version back to the community is tantamount to bad manners. When paying for development on certain functionality, make it known. If making KDE work with some 3rd-party software is the goal, there is no reason to hide that.
  • Be committed longer-term. Contributing something to a project creates the burden of maintaining the piece for later releases. It is only prudent to make sure that this maintenance does happen. Also, most non-trivial problems or improvements are hard to fully understand and solve in the best possible way with little insight into the project. It is better to start fixing simple issues, and working up to the harder problems, learning enough about the project in the process.
  • Be generous. Contribute because there is a common interest with the community, and with no strings attached. Support the community where possible. Share your knowledge and experience at least within the community. Take part in the whole process, and maybe fix things where possible even if those are not directly necessary for you.
  • Be humble. A community effort is not a one man or one entity show. A few contributions do not mean the contributor 0wnz the project. Volunteers fall into this trap just as easily as companies, overstretching the bragging rights earned with their work. The equivalent to individuals bragging is companies marketing that they practically “built the whole thing by themselves”. Such behaviour will damage the community spirit of working together.

Decision Making

Notably absent from the bullets above are distinctions between volunteer and commercial contributors. While some aspects are more applicable to one or the other, the described behaviour is expected from all members of the community alike. But if the responsibilities are the same, why should the rights differ?

Decisions in groups are based on votes. Explicit or not, whenever a group gets to a decision, it has weighed the opinions of each group member in a certain way and then come to a conclusion. A common and popular principle is “One contributor, one vote”. KDE does not fully apply this principle, since it does not give supporting members any kind of say.

The way meritocracy works can be described as “One commit, one vote”. It is commonly accepted that this is a good base for technical decisions, but not for project governance, since it tends to overly represent the opinions of developers.

Projects that heavily depend on one or a few major financial backers often apply the “One euro, one vote” scheme. Seats in advisory boards may be sold to major sponsors, or the weight of votes in the assemblies may be based on the platinum, gold or silver status of a member. It is obvious that in such a setup, volunteers are playing a less important role in the community.

Effectively these schemes define how valuable a specific contribution is. Cash is also a contribution: in some communities it buys more than a bug fix, in others it does not. All of the schemes described above are applied in various Free Software projects, usually following the goals set by the founders of the project (which sometimes are companies, as in the case of Mozilla). An ethical verdict on these setups is impossible, since the motivation of a company to produce something in the public domain is not more or less noble than that of an individual.
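The difference between the three schemes can be sketched in a few lines of Python. This is only an illustration: the members, their commit counts, donations and votes are invented for the example and do not describe KDE or any real project.

```python
from collections import defaultdict

# Hypothetical community members: (name, commits, euros donated, vote cast).
members = [
    ("volunteer_a", 120, 0, "yes"),
    ("volunteer_b", 45, 0, "yes"),
    ("volunteer_c", 10, 0, "yes"),
    ("employee_d", 300, 0, "no"),
    ("sponsor_e", 0, 10000, "no"),
]

# Each scheme is just a different way of weighting a member's vote.
schemes = {
    "one contributor, one vote": lambda commits, euros: 1,
    "one commit, one vote": lambda commits, euros: commits,
    "one euro, one vote": lambda commits, euros: euros,
}


def tally(members, weight):
    """Sum the votes, weighting each member with the given function."""
    totals = defaultdict(int)
    for _name, commits, euros, vote in members:
        totals[vote] += weight(commits, euros)
    return dict(totals)


def winner(totals):
    """Return the option with the highest weighted total."""
    return max(totals, key=totals.get)


for label, weight in schemes.items():
    totals = tally(members, weight)
    print(label, totals, "->", winner(totals))
```

With this made-up data the three volunteers win a simple head count, but are outvoted by a single employee under the commit-based scheme and by a single sponsor under the donation-based one.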

Learn and Adapt

It is due to KDE’s and KDE e.V.’s constitution that despite several shake-ups in the industry, KDE remains alive, innovative and independent. The protection mechanisms built into the KDE e.V. bylaws have proven to be effective. With this in mind, it would be convenient to simply carry on with business as usual.

KDE is one of very few prototypical Free Software communities with little external influence. It is driven by volunteers, and not slowed down by influencers that consider change to be a barrier to sales. It is this independence that allowed the project the drastic rewrite that became KDE 4, even though the initial release left users hurting. Again, since it works well, is there a need for change?

Apparently, in recent years, this independence has also become our biggest weakness as a community. Some influence from more market- and user-focused companies is normally beneficial to a software product. The central question is how much influence KDE is willing to grant third parties that become part of the community. The list of contributor’s rights discussed in this essay can serve as a start for the discussion. Applying it consistently would at least require applying the “One member, one vote” principle, essentially raising supporting members to full active member status. Other or additional steps are possible. In essence, we need to show respect to the community, while at the same time catering to the special wishes of commercial contributors.

At the moment, KDE applies an extremely open policy to newcomer volunteer contributors, while it erects barriers like mandatory membership fees without any tangible influence for entities that want to become part of the community. Since companies are part of the community, this situation does not reflect the open spirit of the community that KDE otherwise represents. It is important to include all stakeholders in the community, and that includes companies with KDE-related products and services.

The KDE ON program that is currently in development is intended to build a network of such companies to become a part of the KDE community, and to improve collaboration between the volunteers and them. It effectively redefines the idea of the KDE community to be more inclusive, just as the KDE community redefined itself to be a collection of related projects instead of a single desktop project. At Cebit, KDE has approached a first group of companies, raising awareness of the common interest and inviting them to join the game. The response was surprisingly positive, which leads us to believe that it is possible to implement this program.

Thanks for reading.

@mirkoboehm • Mirko on LinkedIn • @AgileWorkers

OSSMole Project (Syracuse & Elon)Student work using FLOSSmole data

I often have my students tackle FLOSSmole data as a way of learning more about FLOSS, databases, data visualization, etc.

Here is an example of one of the graphs my students worked on last week, using Freecode data in FLOSSmole, R, and Illustrator.

January 2012 Freecode data set is available here.

Diomidis Spinellis (Athens Univ)Package Management Systems

DLL hell was a condition that often afflicted unfortunate users of old Microsoft Windows versions. Under it, the installation of one program would render others unusable due to incompatibilities between dynamically linked libraries. Suffering users would have to carefully juggle their conflicting DLLs to find a stable configuration. Similar problems distress any administrator manually installing software that depends on incompatible versions of other helper modules.

OSSMole Project (Syracuse & Elon)February Github data released

February data has been released for Github.

Get the data here from our Google Code downloads page or request direct database access here.

Included with Github data are the following values:
project name
developer name
description
private yes/no
fork number
homepage
number of watchers
open issues
...and all the xml values that these fields are based on!
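For anyone loading these values into a script, one record could be modeled roughly as follows. This is only a sketch: the class name, field names, and types are assumptions derived from the list above, not the actual FLOSSmole column names, and `parse_yes_no` is a hypothetical helper for the "private yes/no" value.

```python
from dataclasses import dataclass


@dataclass
class GithubProjectRecord:
    """One project row, mirroring the fields listed above (names are assumed)."""
    project_name: str
    developer_name: str
    description: str
    is_private: bool   # "private yes/no"
    fork_number: int   # assumed to be stored as an integer
    homepage: str
    watchers: int      # "number of watchers"
    open_issues: int


def parse_yes_no(value: str) -> bool:
    """Turn a yes/no string into a boolean (hypothetical helper)."""
    return value.strip().lower() == "yes"


if __name__ == "__main__":
    # Made-up sample values, only to show how a record is built.
    record = GithubProjectRecord(
        project_name="demo",
        developer_name="alice",
        description="example project",
        is_private=parse_yes_no("no"),
        fork_number=0,
        homepage="",
        watchers=3,
        open_issues=1,
    )
    print(record)
```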

Have fun!

OSSMole Project (Syracuse & Elon)February Google Code data released

Google Code data has been released for January/February 2012.

Get the data here from our Google Code downloads page or request direct database access here.

Be aware that there is one open bug for Google Code collection that may affect your use of this data.

Included in the Google Code run this time is: project info, developer list for each project (names obfuscated in some cases), blog info, labels, links, groups, etc etc. Have fun!!

David A. WheelerDebian GNU/Linux = $19 billion

Debian developer James Bromberger recently posted the interesting “Debian Wheezy: US$19 Billion. Your price… FREE!”, where he explains why the newest Debian distribution (“Wheezy”) would have cost US$19 billion to develop if it had been developed as proprietary software. This post was picked up in the news article “Perth coder finds new Debian ‘worth’ $18 billion” (by Liam Tung, IT News, February 14, 2012).

You can view this as an update of my More than a Gigabuck: Estimating GNU/Linux’s Size, since it uses my approach and even uses my tool sloccount. Anyone who says “open source software can’t scale to large systems” clearly isn’t paying attention.
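The kind of estimate involved can be sketched in a few lines. This assumes the Basic COCOMO model in organic mode, which sloccount uses by default as far as its documentation describes; the default salary and overhead figures below are the sloccount defaults as best recalled and should be treated as assumptions, as should the function name and the sample input. It is not the exact calculation behind the $19 billion figure, which works from per-package line counts and its own salary figure.

```python
def basic_cocomo_cost(sloc, annual_salary=56286.0, overhead=2.4):
    """Estimate effort, schedule and cost with Basic COCOMO (organic mode).

    sloc          -- physical source lines of code
    annual_salary -- assumed average developer salary in US dollars per year
    overhead      -- assumed multiplier covering non-salary costs
    """
    ksloc = sloc / 1000.0
    # Effort grows slightly faster than linearly with code size.
    effort_person_months = 2.4 * ksloc ** 1.05
    # Schedule (calendar months) derived from effort.
    schedule_months = 2.5 * effort_person_months ** 0.38
    # Convert person-months to person-years, then to dollars.
    cost = (effort_person_months / 12.0) * annual_salary * overhead
    return effort_person_months, schedule_months, cost


if __name__ == "__main__":
    # Hypothetical input: a single 100,000-line package.
    effort, schedule, cost = basic_cocomo_cost(100_000)
    print(f"{effort:.1f} person-months, {schedule:.1f} months, ${cost:,.0f}")
```

Because the exponent is above 1, one large package is estimated to cost more than the same number of lines split across several small ones, so whether counts are summed before or after applying the formula changes the total.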

Diomidis Spinellis (Athens Univ)How to Decrypt "Secrets for Android" Files

Secrets for Android is a nifty Android application that allows you to securely store passwords and other sensitive data on your Android phone. Your data are encrypted with your supplied password using strong cryptography and are therefore protected if your phone gets stolen. Although the application offers a backup and an export facility, I found both wanting in terms of the availability and confidentiality associated with their use.

Karl BeecherMore trainings available via Mixin

I was recently added to the staff of trainers at Mixin, a Germany-based training company run by Dr. Björn Kesper. My entry into the company means they now offer Java and Eclipse trainings in addition to a whole host of others, including C#, .NET, SQL Server, HTML, CSS and a whole lot more. Follow the link… read more

Martin F Krafft (Univ Limerick)Stop ACTA

I hope by now you have heard of ACTA. In any case, here is a nice 6:30 minute video giving a good overview.

Please help stop ACTA. Our freedom is at risk. Whether you tell people about it, write about it, use services like Twitter to tell the world about #StopACTA, or whether you take the time to march against what corporate entities are lobbying politicians to do against their people — please help protect the Internet as we know it.

NP: God is an Astronaut: Moment of Stillness

David A. WheelerNew Hampshire: Open source, open standards, open data

The U.S. state of New Hampshire just passed act HB418 (2012), which requires state agencies to consider open source software, promotes the use of open data formats, and requires the commissioner of information technology (IT) to develop an open government data policy. Slashdot has posted a discussion about it. This looks really great, and it looks like a bill that other states might want to emulate. My congrats go to Seth Cohn (the primary author) and the many others who made this happen. In this post I’ll walk through some of its key points on open source software, open standards for data formats, and open government data.

First, here’s what it says about open source software (OSS): “For all software acquisitions, each state agency… shall… Consider whether proprietary or open source software offers the most cost effective software solution for the agency, based on consideration of all associated acquisition, support, maintenance, and training costs…”. Notice that this law does not mandate that the state government must always use OSS. Instead, it simply requires government agencies to consider OSS. You’d think this would be useless, but you’d be wrong. Fairly considering OSS is still remarkably hard to do in many government agencies, so having a law or regulation clearly declare this is very valuable. Yes, closed-minded people can claim they “considered” OSS and paper over their biases, but laws like this make it easier for OSS to get a fair hearing. The law defines “open source software” (OSS) in a way consistent with its usual technical definition; indeed, this law’s definition looks a lot like the free software definition. That’s a good thing; the impact of laws and regulations is often controlled by their definitions, so having good definitions (like this one for OSS) is really important. Here’s the New Hampshire definition of OSS, which I think is a good one:

  1. “Unrestricted use of the software for any purpose;
  2. Unrestricted access to the respective source code;
  3. Exhaustive inspection of the working mechanisms of the software;
  4. Use of the internal mechanisms and arbitrary portions of the software, to adapt them to the needs of the user;
  5. Freedom to make and distribute copies of the software; and
  6. Modification of the software and freedom to distribute modifications of the new resulting software, under the same license as the original software.”

The material on open standards for data says, “The commissioner shall assist state agencies in the purchase or creation of data processing devices or systems that comply with open standards for the accessing, storing, or transferring of data…” The definition is interesting, too; it defines an “open standard” as a specification “for the encoding and transfer of computer data” that meets a long list of requirements, including that it “Is free for all to implement and use in perpetuity, with no royalty or fee” and that it “Has no restrictions on the use of data stored in the format”. The list is actually much longer; it’s clear that the authors were trying to counter the common tricks of vendors who try to create “open” standards that really aren’t. I think it would have been great if they had adopted the more stringent Digistan definition of open standard, but this is still a great step forward.

Finally, it talks about open government data, e.g., it requires that “The commissioner shall develop a statewide information policy based on the following principles of open government data”. This may be one of the most important parts of the bill, because it establishes these as the open data principles:

  1. “Complete. All public data is made available, unless subject to valid privacy, security, or privilege limitations.
  2. Primary. Data is collected at the source, with the highest possible level of granularity, rather than in aggregate or modified forms.
  3. Timely. Data is made available as quickly as necessary to preserve the value of the data.
  4. Accessible. Data is available to the widest range of users for the widest range of purposes.
  5. Machine processable. Data is reasonably structured to allow automated processing.
  6. Nondiscriminatory. Data is available to anyone, with no requirement of registration.
  7. Nonproprietary. Data is available in a format over which no entity has exclusive control, with the exception of national or international published standards.
  8. License-free. Data is not subject to any copyright, patent, trademark, or trade secret regulation. Reasonable privacy, security, and privilege restrictions may be allowed.”

The official motto of the U.S. state of New Hampshire is “Live Free or Die”. Looks like they truly do mean to live free.

Olivier Berger (GET/INT)ADMS.F/OSS : standardizing meta-data for software description in forges or software catalogues

Maybe this could be of interest to a few of my readers who may have missed the announcement, in particular for ones related to forges which will be deployed for private administrations in Europe.

The recently started ADMS.F/OSS project is described as:

ADMS.F/OSS is an XML and RDF vocabulary to describe software, in
particular free and open-source software (F/OSS), making it possible to
more easily search and discover software. The ADMS.F/OSS specification is still under development.

It is being developed as part of an EC (European Community) programme for interoperability between public administrations (more on the page above).

I’ll try to participate in the working group, bringing in some feedback from the efforts on similar issues conducted during the (now over) COCLICO project.

Hope this helps.

Karl BeecherAnd, WordPress hackers, just in case you’re interested…

… here’s how I made the earlier slightly tricky change. Using this tutorial page I changed the front page of the site into a static page, which sat at “computerfloss.com/blog”. But I wanted the front page to be at the root of the domain, so I skipped over to this other tutorial page. This one… read more

Karl BeecherComputer Floss – Slight URL change. Update your feed!

I’ve added a new static front page to the computerfloss.com site. (I have some plans to expand here, you see.) The “/blog” part of the URL is no longer needed, so the old RSS address won’t work. Either click the RSS icon or change your feed address to: http://computerfloss.com/feed/rss/   read more

OSSMole Project (Syracuse & Elon)January 2012 releases

We're cruising ahead with January 2012 releases. Grab the data from the Google Code site or from the teragrid.

Freecode - done (formerly known as Freshmeat)
Savannah - done
Tigris - done
Rubyforge - done
Objectweb - done
Launchpad - done

Google Code - still running
Alioth - bug submitted #54
Github - will start as soon as Google is done

Free Software Foundation - bug still not fixed (this is my fault) #51

Interesting things: most popular data from November ..... drumroll please.... Google Code, Github.

David A. WheelerStop SOPA and PIPA

Please protest the proposed SOPA (Stop Online Piracy Act) and PIPA (PROTECT IP Act). The English Wikipedia is blacked out today, and many other websites (like Google) are trying to raise awareness of these hideous proposed laws. The EFF has more information about PIPA and SOPA. Yes, the U.S. House has temporarily suspended its work, but that is just temporary; it needs to be clear that such egregious laws must never be accepted.

Wikimedia Foundation board member Kat Walsh puts it very well: “We [the Wikimedia Foundation and its project participants] depend on a legal infrastructure that makes it possible for us to operate. And we depend on a legal infrastructure that also allows other sites to host user-contributed material, both information and expression. For the most part, Wikimedia projects are organizing and summarizing and collecting the world’s knowledge. We’re putting it in context, and showing people how to make sense of it. But that knowledge has to be published somewhere for anyone to find and use it. Where it can be censored without due process, it hurts the speaker, the public, and Wikimedia. Where you can only speak if you have sufficient resources to fight legal challenges, or, if your views are pre-approved by someone who does, the same narrow set of ideas already popular will continue to be all anyone has meaningful access to.”

Footnotes