As I get ready for the first meeting of the Chan-Zuckerberg Initiative’s Essential Open Source Software for Science program, which brings together grantees, advisors, funders, researchers, and industry partners with the goal “to identify shared needs and how funders might support these needs in the future”, I wonder how much software maintenance should cost, and how we might determine that answer.
Software maintenance is important to software sustainability for several reasons: it avoids software collapse and keeps the software working, bugs need to be fixed, new features need to be added, and the software needs to be brought to new platforms on which it could be run.
Many funders support the development of open source software, including CZI, the Sloan Foundation, NSF, DOE, and NIH. Most of these support maintenance as an element of development, but CZI might be unique in funding maintenance by itself.
In addition, various governments and policy makers support the concept of public sharing of publicly funded research at various levels, including open access, open data, and open software. Typically, there is little cost to distribute these products and a fairly low cost to maintain them in a readable state, and this can be done by curators who have general skills related to publications, data, etc. But maintaining software in a reexecutable, reusable, rebuildable, and redevelopable state has higher costs and involves more human work, and it needs to be done by the developers or, at the least, by people with detailed knowledge of the specific software.
We care about openness for multiple reasons, including that publicly funded work should be available to the public, that open sharing is needed for full understanding and reproducibility, and that open sharing reduces unneeded duplication of work.
Another issue is that the total amount of software grows over time. Much software is added, some of which replaces other software, and some of which is new. At the same time, some existing software stops being used because it has been replaced, and some stops being used because the problems it was designed to solve are no longer being worked on.
So if we want software to be available and usable, how much should we budget for this?
Do we want to look at the cost of maintaining individual software packages, or the overall cost of maintaining the software ecosystem?
If we look at individual packages, who would decide which should be maintained and which should not? And how would we determine who should be responsible for the maintenance? The original funders perhaps? The latest funders? All funders?
Could we use some usage or impact metrics to decide which software should be maintained? And to determine who should support the maintenance?
Could we imagine that maintenance costs are tied to development costs for a funding program? Is there an X, such that a funding agency should dedicate to software maintenance X% of the funds it uses for software development? (I’ve seen 5% mentioned for this, but I don’t think there is any evidence that 5 is the right number.)
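To make the X% idea concrete, here is a minimal sketch of the arithmetic in Python. The function name `maintenance_budget` and the $10M development budget are hypothetical, chosen only for illustration. The 5% value is the figure mentioned above, and the other percentages are arbitrary comparison points, not recommendations.

```python
def maintenance_budget(development_funds: float, x_percent: float) -> float:
    """Return the maintenance set-aside: x_percent of development_funds."""
    if development_funds < 0:
        raise ValueError("development_funds must be non-negative")
    if not 0 <= x_percent <= 100:
        raise ValueError("x_percent must be between 0 and 100")
    return development_funds * x_percent / 100.0


if __name__ == "__main__":
    # Hypothetical program size, for illustration only.
    development_funds = 10_000_000

    # 5 is the value mentioned in the text; 10 and 20 are arbitrary comparisons.
    for x in (5, 10, 20):
        amount = maintenance_budget(development_funds, x)
        print(f"X = {x}%: ${amount:,.0f} for maintenance")
```

The calculation is trivial once X is known. The hard part is the open question above: what evidence would justify any particular value of X?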
I don’t have answers to these questions, but I want to encourage us to consider them, and to think about what information we would need to answer them.