FAIR is not fair enough

The FAIR data principles, defined as “a set of guiding principles to make data Findable, Accessible, Interoperable, and Re-usable,” came out of a meeting in Jan 2014 that “brought together 25 high level participants representing leading research infrastructures and policy institutes, publishers, semantic web specialists, innovators, computer scientists and experimental (e)Scientists.”

The idea of FAIR seems to be catching on, and it is potentially being applied to other types of objects, such as software.  For example, a recent paper, “Four simple recommendations to encourage best practices in research software” (of which I am one of many co-authors), says:

“While the FAIR principles were originally designed for data, they are sufficiently general that their high level concepts can be applied to any digital object including software. Though not all the recommendations from the FAIR data principles directly apply to software, there is good alignment between the OSS recommendations [the software recommendations in the paper] and the FAIR data principles”

However, it is clear to me that settling for FAIR is not really fair enough.

First, note that FAIR doesn’t actually require the objects (e.g., data, software) to be openly available.  The FAIR principles narrowly define accessibility: the metadata protocol must be open and the metadata themselves must be accessible, but the data (or, in our case, software) need not be.  As I might have once heard on a playground, “not sharing is not fair.”

Some might argue that the spirit of the FAIR principles is more generous than the letter, and this may be true.  Many readers of the FAIR principles may take them to mean that objects should be available.  But this is not what the principles actually say, and I think this is important.

Second, FAIR doesn’t include the idea of credit. In academia, credit is important, as it’s currently a key factor in hiring and promotion. However, there are philosophical differences regarding credit. For example, the open source software movement is generally not concerned with credit, though efforts like Depsy and Libraries.io are trying to inject it. This idea of creating works without being concerned about credit has also been taken up by the Maker movement.  Influenced by both of these, Cory Doctorow in a recent book, Walkaway, suggests a future where the idea of credit is strongly discouraged, with a main character stating, “We’re making a world where greed is a perversion.”

In general, our global society uses credit as an incentive to encourage production of objects and ideas, where credit is financial, intellectual, or academic.  A commons model, which works in small groups, does not use credit in this way, but rather encourages production that benefits the group as a whole.  I personally do not believe that this model scales to larger groups, and thus, I believe credit is essential. In addition to the personal benefit of credit, there is a group benefit: recognizing and using the expertise of individuals is how large communities function effectively.

Another group that is adding to FAIR is the FORCE11 Scholarly Commons working group, which is “exploring what is required for a scholarly communication ecosystem designed for 21st century scholarship.” There, we seem to be moving towards “open, FAIR, and citable” as a goal for objects in the scholarly commons, with the idea that these three concepts can leverage each other. For example, see the description of course AM2: Scholarship in the 21st Century in this summer’s FORCE11 Scholarly Communications Institute. However, this is by no means a settled issue within the working group.

In summary, I think that products that are not open are unlikely to fully benefit the research community, and while I am somewhat sympathetic to the concept of a world where credit is not important, I don’t think it’s realistic.  Thus, I believe this larger set of attributes, “open, FAIR, and citable,” is fairer than just FAIR, and is much more likely to lead to more and better research.


Thanks to Melissa Haendel, Fiona Murphy, and Daniel Paul O’Donnell for useful feedback, though all opinions (and any errors) are mine.

 

Published by:

danielskatz

Assistant Director for Scientific Software and Applications at NCSA, Research Associate Professor in CS, ECE, and the iSchool at the University of Illinois Urbana-Champaign; works on systems and tools (aka cyberinfrastructure) and policy related to computational and data-enabled research, primarily in science and engineering

2 thoughts on “FAIR is not fair enough”

  1. Hi Daniel,
    I appreciate that you, and many others, are wondering about what FAIR means for you and your communities. We wrote a follow-up [1] to the original paper [2], which addresses some of your concerns, particularly why FAIR does not specifically emphasize Open.

    I think we should strive to make as much content open and freely accessible as possible. However, for many communities, making all content open is simply not an option. One example that is pertinent to me and my community is the use of patient data for biomedical discovery. Patient data is particularly sensitive, and there is a lot of legislation that makes it illegal to distribute patient data. Thus, our concern is that where research results make use of sensitive data, there must be a clearly described mechanism, whatever that may be, by which access to the data is possible – FAIR Principle A1. However, this mechanism must be described in the metadata for the resource, and we strongly believe that metadata should always be made available, regardless of the availability of the data – FAIR Principle A2. This enables people (and machines) to potentially find, negotiate, and reuse access-restricted content.

    The second point you raise is about citation. Citation is a reference to another work. The ability to cite is an important aspect of acknowledging that another work is relevant to the work at hand. Thus, there are three pieces needed for citation. The first is that the work to be referenced can be uniquely identified. This is addressed by having a globally unique and persistent identifier – FAIR Principle F1. The second is that the relationship between different entities must be qualified, such that we can specifically indicate the relationship that the referred-to work has to the current work – FAIR Principle I3. The third is that, to be FAIR, we include detailed provenance for our work, specifically addressed in FAIR Principle R1.2. These, in combination, ensure that a work is FAIR when it provides citation to other relevant work.

    I hope that helps explain why we believe that the FAIR principles are really quite fair. The FAIR principles provide a solid foundation to enable the broadest set of uses from all sorts of communities.

    [1] http://content.iospress.com/articles/information-services-and-use/isu824
    [2] https://www.nature.com/articles/sdata201618

  2. Thanks for your comment, Michel. Please note that I did not say that I thought FAIR was not fair (it is), just that I think it does not go far enough towards the scholarly system and culture I would like to see. My concern is that people will settle for FAIR, instead of trying to go further.
