Contribution vs Credit vs Authorship for software

While preparing for a workshop on software credit that the URSSI project is hosting today, I started thinking again about a recent blog post by Titus Brown, “Revisiting authorship, and JOSS software publications.” Titus says, “fundamentally, in order to nurture a diverse array of valuable scientific contributions, we need new models of publication with new models of authorship,” a statement with which I strongly agree and which is, in fact, part of the reason for the URSSI workshop.

He also said “If [someone] can pinpoint the contribution made by an individual, and it was a positive contribution, as opposed to an extractive one, that person is a contributor. … And contributors deserve to be offered authorship.” My immediate thought was that this made sense, and I said so in a comment on Titus’s blog.

However, I then read Luis Pedro Coelho’s response to Titus’s blog, “Thoughts on ‘Revisiting authorship, and JOSS software publications,’” which made me realize my first reaction was a bit naïve, as happens to all of us sometimes. Luis writes “My position is against every positive contribution deserves authorship. Some positive contributions are significant enough that they deserve authorship, others acknowledgements, others can even go unmentioned.” And, along the lines of an old joke, I agree with this too.

As both Titus and Luis reminded me, contribution and credit and authorship are really three different things. Thinking about software projects, or really about almost any collaborative activity, we want to first record positive contributors and their contributions, then provide appropriate credit for those contributors. Finally, in the case of academia, we decide that some of these contributions pass a bar and qualify the contributor as an author.

(Note that in academia, we have changed the word “author” so that it no longer means a person who writes but instead simply denotes a level of contribution. It would be amusing to read a newspaper article about a person who merely suggested a plot idea to a writer for a future novel and later requested to be made an author of the finished book, while if that book were a bestseller, it wouldn’t be surprising for that person to sue the author for some of the profits.)

Thinking about the three concepts one at a time, a contribution that should be recognized is, as Titus and Luis both agree, any positive contribution to the software project, whether in code or not, whether in the repository or not, etc. It’s not too hard to record contributions by adding the contributor’s name to the CONTRIBUTORS.txt file in the software repository.

And all of these contributors should get credit for their contributions.  But how should we record this?  Is there a way to record the type or value of each contribution?  And a way to add them up to determine the total amount of credit that should be given to each person? (If we could do both, we could use this to implement transitive credit.)
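One way to picture the “adding up” step is the Python sketch below. It assumes, hypothetically, that each product carries a credit map whose weights sum to 1 and that the map can name both people and other products it depends on; credit assigned to a dependency is then passed on to that dependency's own contributors. The product names, people, and weights are all invented for illustration, and this is only one possible reading of transitive credit.

```python
from collections import defaultdict

# Hypothetical credit maps: each product splits a total weight of 1.0
# among people and other products. All names and numbers are invented.
CREDIT_MAPS = {
    "paper": {"alice": 0.5, "bob": 0.25, "library": 0.25},
    "library": {"carol": 0.5, "dave": 0.5},
}


def transitive_credit(product, maps, weight=1.0, totals=None, _seen=()):
    """Push `weight` for `product` down to people, following product entries."""
    if totals is None:
        totals = defaultdict(float)
    if product in _seen:
        raise ValueError(f"cycle in credit maps at {product!r}")
    for name, share in maps[product].items():
        if name in maps:
            # The entry is itself a product: pass its share on to its contributors.
            transitive_credit(name, maps, weight * share, totals, _seen + (product,))
        else:
            # The entry is a person: accumulate their share.
            totals[name] += weight * share
    return dict(totals)


credit = transitive_credit("paper", CREDIT_MAPS)
```

With these made-up weights, the people behind the library each receive half of the quarter share that the paper assigns to the library, so the total credit handed out for the paper still sums to 1.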

Project CRediT, which I wrote about a few years ago, is an attempt to describe types of credit for work involved in writing papers. It would likely be possible to reach a community view of the equivalent types of work involved in software projects. This would then allow recording contributions as they happen along qualitative lines, basically in a more complex CONTRIBUTORS file.
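As a rough illustration of what a “more complex CONTRIBUTORS file” might hold, here is a minimal Python sketch. The contribution types below are placeholders I made up for the example; they are not the CRediT roles and not an agreed community taxonomy, and the names and notes are invented.

```python
from collections import defaultdict
from dataclasses import dataclass

# Hypothetical contribution types; a real list would come from the community.
CONTRIBUTION_TYPES = {"code", "documentation", "testing", "design", "support"}


@dataclass(frozen=True)
class Contribution:
    person: str
    kind: str   # one of CONTRIBUTION_TYPES
    note: str   # free-text description of what was done


def record(log, person, kind, note):
    """Append one typed contribution, rejecting unknown types."""
    if kind not in CONTRIBUTION_TYPES:
        raise ValueError(f"unknown contribution type: {kind!r}")
    log.append(Contribution(person, kind, note))


def by_person(log):
    """Summarize which types of contribution each person has made."""
    summary = defaultdict(set)
    for contribution in log:
        summary[contribution.person].add(contribution.kind)
    return {person: sorted(kinds) for person, kinds in summary.items()}


log = []
record(log, "alice", "code", "implemented the solver")
record(log, "bob", "documentation", "wrote the tutorial")
record(log, "alice", "testing", "added regression tests")
summary = by_person(log)
```

Recording the type alongside the name keeps the log qualitative: it says what kind of work was done without yet assigning any value to it.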

If authorship is considered a contribution that rises above some bar, how do we determine where that bar is, and how do we quantify contributions to check on their value in comparison? One idea is to borrow from the movie industry, where some contributions (e.g., the director’s) are always enough to be listed, while others (e.g., costume cleaning) may not be. I’ve been told that even for movies with a long list of credits at the end, showing large numbers of people from various special effects companies, the contract with those companies includes a maximum number of people who can be listed in the end credits, so these are not really complete listings.

I think that determining where the bar is will likely need to be done by communities, somewhat in the same way that journals represent the work of communities (sometimes also referred to as clubs). However, the decisions about the value of individual contributions probably need to be made by each software project itself.

In summary, I think a short-term solution is:

  1. to collectively decide what types of software contributions there are,
  2. to decide, in the context of communities, where the bar for authorship is, and what types and amounts of contributions should qualify,
  3. to record the individual contributions and their types, and
  4. to let the lead author make the final decision on authorship, with acceptance from the other authors, and with the peer-reviewers and editors representing the overall community confirming this decision.
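To make the steps above a little more concrete, here is a small Python sketch of how steps 2 through 4 could fit together. Everything in it is an assumption for illustration: the point values per contribution type, the bar of 5, and the names are invented, and the output is only a proposal that the lead author, the other authors, and the reviewers would still need to confirm.

```python
# Hypothetical community policy: points per contribution type and a bar.
POINTS = {"code": 3, "design": 3, "documentation": 2, "testing": 2, "support": 1}
AUTHORSHIP_BAR = 5


def propose_authors(contributions, points=POINTS, bar=AUTHORSHIP_BAR):
    """Split contributors into proposed authors and acknowledged contributors.

    `contributions` is an iterable of (person, contribution_type) pairs.
    """
    totals = {}
    for person, kind in contributions:
        totals[person] = totals.get(person, 0) + points[kind]
    candidates = sorted(p for p, total in totals.items() if total >= bar)
    acknowledged = sorted(p for p, total in totals.items() if total < bar)
    return candidates, acknowledged


contributions = [
    ("alice", "code"),
    ("alice", "testing"),
    ("bob", "documentation"),
    ("carol", "design"),
    ("carol", "code"),
]
candidates, acknowledged = propose_authors(contributions)
```

The function deliberately stops at a proposal: a fixed numeric bar is a crude stand-in for a community judgment, which is why the final decision in step 4 stays with people.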

In the longer term, we need to stop using the term author as the means of recognizing all significant contributions, and possibly go to a movie-like system where we name contributors and explain their contributions, and where author would be one of many types of contribution.

Acknowledgements: In addition to Titus’s and Luis’s blogs, previous discussions with Karthik Ram, Arfon Smith, and Matt Turk influenced my thinking on these issues.

Published by:

Daniel S. Katz

Chief Scientist at NCSA, Research Associate Professor in CS, ECE, and the iSchool at the University of Illinois Urbana-Champaign; works on systems and tools (aka cyberinfrastructure) and policy related to computational and data-enabled research, primarily in science and engineering


4 thoughts on “Contribution vs Credit vs Authorship for software”

  1. Dan, nice post. Sophie Hou and I wrote a recent paper on this topic, https://doi.org/10.2218/ijdc.v11i1.357. One of the things we note in our paper, however, is that movie credits are a highly regulated form of credit. Most of the credit designations in movies are mediated by unions, guilds, etc. For example, screenwriting credits for Hollywood movies have to be reviewed and approved by the Writers Guild of America. To move in that direction within science would require the creation of analogous forms of institutionalized mediation.

    Another relevant point here is that this kind of approach was discussed at the American Meteorological Society meeting a few weeks ago and there was significant pushback, particularly by senior scientists. Specifically, the concern cited most strongly was that it could make things harder for students: if a PI has an idea for a study that is then carried out and written up by a student, would the student be penalized when trying to get a job for not being the originator of the idea? In other words, transparency around work roles can be a double-edged sword. Exposing that specific people contributed to tasks that are not conventionally considered to be important could a) raise the status of those tasks, or b) reduce the importance of the people doing those tasks. I would not be willing to make a prediction about which way the scale would tilt between those two.

      1. Just another thought. There is some research on the sociology of these issues of transparency of work roles. The best known are studies about nurses, and how their work is (or isn’t) represented in hospital information systems. This article, by Bowker, Timmermans, and Star, talks about three key issues: comparability, control, and visibility. https://doi.org/10.1007/978-0-387-34872-8_21

        I could imagine thinking about breaking down academic authorship roles with these three issues in mind. E.g. Comparability – what is the granularity of the specifiable roles and are they internally consistent. Control – who gets to decide who gets slotted into what roles (authorship designations are already fraught with power issues). Visibility – as noted above, once something becomes visible, it can be more easily rewarded, and more easily dismissed.
