How much should software maintenance cost?

I’m getting ready for the first meeting of the Chan Zuckerberg Initiative’s Essential Open Source Software for Science program, which will include grantees, advisors, funders, researchers, and industry partners. The goal of the meeting is “to identify shared needs and how funders might support these needs in the future”. As I prepare, I wonder how much software maintenance should cost, and how we might determine the answer.

Software maintenance is important to software sustainability for several reasons: it keeps the software working and avoids software collapse, it fixes bugs, it adds new features, and it brings the software to new platforms on which it could be run.

Many funders support the development of open source software, including CZI, the Sloan Foundation, NSF, DOE, and NIH. Most of these support maintenance as an element of development, but CZI might be unique in funding maintenance by itself.

In addition, various governments and policy makers support the concept of public sharing of publicly funded research at various levels, including open access, open data, and open software. Typically, there is little cost to distribute these products, and fairly low cost to maintain them in a readable state, and this can be done by curators who have general skills related to publications, data, etc. But maintaining software in a reexecutable, reusable, rebuildable, and redevelopable state has higher costs, involving more human work, and needs to be done by the developers, or at the least, people with detailed knowledge of the specific software.

We care about openness for multiple reasons, including that publicly funded work should be available to the public, that open sharing is needed for full understanding and reproducibility, and that open sharing reduces unneeded duplication of work.

Another issue is that the total amount of software grows over time. Much software is added, some of which replaces other software, and some of which is new. So some existing software stops being used because it has been replaced, and some stops being used because the problems it was designed to solve are no longer being solved.

So if we want software to be available and usable, how much should we budget for this?

Do we want to look at the cost of maintaining individual software packages, or the overall cost of maintaining the software ecosystem?

If we look at individual packages, who would decide which should be maintained and which should not? And how would we determine who should be responsible for the maintenance? The original funders perhaps? The latest funders? All funders?

Could we use some usage or impact metrics to decide which software should be maintained? And to determine who should support the maintenance?

Could we imagine that maintenance costs are tied to development costs for a funding program? Is there an X such that a funding agency should dedicate to software maintenance X% of the funds it uses for software development? (I’ve seen 5% mentioned for this, but I don’t think there is any evidence that 5 is the right number.)
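
To make the arithmetic of such a rule concrete, here is a minimal sketch in Python. The function name, the program size of 10,000,000, and the candidate percentages other than 5 are all hypothetical and chosen only for illustration; nothing here suggests which value of X is right.

```python
def maintenance_budget(development_funds: float, percent: float) -> float:
    """Return the maintenance set-aside: percent% of development funds.

    Hypothetical helper for illustration; not any funder's actual policy.
    """
    if development_funds < 0:
        raise ValueError("development_funds must be non-negative")
    if not 0 <= percent <= 100:
        raise ValueError("percent must be between 0 and 100")
    return development_funds * percent / 100.0


if __name__ == "__main__":
    # Assumed annual software development spending for one funding program.
    development_funds = 10_000_000
    # 5 is the figure mentioned above; 1 and 10 are arbitrary comparisons.
    for x in (1, 5, 10):
        amount = maintenance_budget(development_funds, x)
        print(f"X = {x}%: {amount:,.0f} set aside for maintenance")
```

The rule itself is trivial to compute. The hard part is choosing X, and that choice needs evidence about what maintenance actually costs.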

I don’t have answers to these questions, but I want to encourage us to consider them, and to think about what information we would need to answer them.

Published by:

danielskatz

Assistant Director for Scientific Software and Applications at NCSA, Research Associate Professor in CS, ECE, and the iSchool at the University of Illinois Urbana-Champaign; works on systems and tools (aka cyberinfrastructure) and policy related to computational and data-enabled research, primarily in science and engineering

2 thoughts on “How much should software maintenance cost?”

  1. I think the credit assignment problem is the hardest problem here. And I think the funding should be somehow based on need. If you look at the usage statistics, I would assume git is one of the most highly used tools in research software by most metrics I can think of. But I don’t think funders throwing money at git is the most useful thing to do. I think (some) tools that also have wide commercial use are already well-supported. So we should think about the funding gap, and what we want the responsibility of funding agencies vs. industry to be there.
    So I don’t think the main question is how much the maintenance should cost, but who should fund which part.

    Generally I think basing funding on usage metrics would be good, but again that doesn’t account for the problem above. If you think citations are a good metric, how about one full time dev per 1000 publications using it (per year)?

    One way to attack the credit assignment problem would be to ask people where they think funding should go. This is obviously not trivial to implement. I would say if you want to do a percentage, you shouldn’t use a percentage of the software grants, but a percentage of all grants. I think NSF shouldn’t give 5% of its software grants to maintenance, it should give some percentage (1%, 0.1%?) of all grants to the software that made these research grants possible.

    I think tying anything to previous funding is not a good idea: it means that building something without funding makes it less likely to be funded, which I think is a situation we need to avoid at all costs.
