Open Science and its Discontents

Open science has well and truly arrived. Preprints. Research Parasites. Scientific Reproducibility. Citizen science. Mozilla, the producer of the Firefox browser, has started an Open Science initiative. Open science really hit the mainstream in 2016. So what is open science? For some, it simply means more timely and regular releases of data sets, and publication in open-access journals. Others imagine a more radical transformation of science and scholarship and are advocating “open-notebook” science with a continuous public record of scientific work and concomitant release of open data. In this more expansive vision, science will ultimately be transformed from a series of static snapshots represented by papers and grants into a more supple and real-time practice in which professionals and citizen scientists work together to co-create publicly available shared knowledge. Michael Nielsen, author of the 2012 book Reinventing Discovery: The New Era of Networked Science, describes open science less as a set of specific practices than as a process to amplify collective intelligence to solve scientific problems more easily:

To amplify collective intelligence, we should scale up collaborations, increasing the cognitive diversity and range of available expertise as much as possible. This broadens the range of problems that can be easily solved … Ideally, the collaboration will achieve designed serendipity, so that a problem that seems hard to the person posing it finds its way to a person with just the right microexpertise to easily solve it.

Attempts to reform the way we do science have been underway for decades, from arXiv in the 1990s, to open access publishing in the early 2000s. The degree to which any scientific field practices open science varies considerably, but it’s pretty fair to say that the institutional embrace of open science hasn’t exactly been speedy despite demonstrated successes in the physical sciences and mathematics such as the Polymath project and Galaxy Zoo. In physics it has been mainstream for a while now to release manuscripts first on arXiv and big data sets through standardized repositories. In the biomedical sciences, progress has been considerably slower, perhaps due to the field’s larger institutional and financial footprint: it’s the proverbial large supertanker that needs a long time to turn around, let alone move in a different direction. Whatever the reason, open science is now firmly on the radar, and it has unleashed a torrent of opinion and criticism, examining all aspects from its practicality to its desirability.

Although there is a spectrum of responses, criticism of open-science tends to fall into one of two camps, which I will call “conservative” and “radical”. This terminology is not intended to imply an association with any conventional political labels; the terms are simply used for convenience to indicate the relative degree of comfort with the institutional status quo. Let’s look at these two groups of critiques.

The conservative critique: what are all these damn people doing with my data?

The conservative response to regular timely release of pre-publication data could be best summarized by the phrase: “are you kidding me? why would I do that?” The apotheosis of this notion appeared in an editorial published in the New England Journal of Medicine, which described with some horror the “emergence of a new class of research parasites”. The authors further concluded that some of these parasites might not only use that data for their own publications, but might seek to examine whether the original study was correct. Many scientists took to Twitter to express their amazement that anybody would object to a re-examination of the data, since falsification is presumed to be the backbone of the scientific enterprise, and the hashtag #IAmAResearchParasite was trending for several days.

From the perspective of the current incentive system, however, this response is totally rational. In a model where labs or principal investigators are largely funded on “high-impact” papers and grants, there is intense pressure to keep a lid on data as long as possible, even if a collaborative process could, in principle, produce a better result. (In the software world this is often summarized by the phrase attributed to Eric Raymond: “with enough eyeballs, all bugs are shallow”.) It’s also a collective action problem: if all scientists were to release their data, then it would be easier for individuals to buy into a more frequent release of data.

The second area where open-science approaches run into another entrenched form of institutional power is the battle over preprints. Preprints are well established in physics, and about 80% of preprints end up in “traditional journals”. In biology, however, allowing preprints to then be accepted by traditional publishers has been fiercely resisted, especially by “high-impact factor” journals published by publishing conglomerates like Elsevier, ever since e-Biomed was proposed as a biology version of arXiv back in 1999. It’s my sense that these publishers are reading the tea-leaves and realizing, probably quite rightly, that eventually scientists will just cut out the middleman (the journal). Mike Eisen, one of the pioneers of e-Biomed and PLOS, has in fact explicitly proposed that we should eventually just do away with journals and move to a complete preprint + post-publication peer review system. That is obviously a nightmare scenario if you’re the head of a highly profitable, multibillion-dollar publishing conglomerate that benefits from the free labor of scientists (peer review has been estimated to be worth ~1.9 billion British pounds per year).

One of the usual counterarguments to post-publication peer review is that it will produce a flood of lower quality papers. The refrain is: how will I know what to read, now anybody can publish? This can indeed be an issue but it need not be insurmountable. Nielsen and others have proposed a publish-then-filter model. New experiments like Science Matters, which publishes single observations, and The Winnower, which archives grey literature and blogs, can also serve these purposes. One thing seems clear, especially after the Accelerating Science and Publication in Biology (ASAPbio) meeting this Spring: the tide, in biomedical science at least, seems to be turning towards the acceptance of preprints, notably via the biology-specific preprint server bioRxiv. Over time this may lead to an increased willingness to explore different publishing models.

Of course, all these changes are really just baby steps towards a truly fully-fledged open science, because they still enshrine a “scientific paper” as the sole end goal. Other aspects of the open science movement, including building tools for reproducibility and “rewarding” non-paper research products such as code, infrastructure and raw data itself, promise to be equally important but, despite the progress made in the last 6 years, are still largely unrewarded by the current academic system. It will probably take a true generational turnover before we see a full-throated embrace of open science, and many of those driving the changes have concluded that it is best done outside traditional academic institutions. In fact, thinking about open-science solely in terms of economic incentive structures may be a wrong, or at least incomplete, way to think about it, which leads me to…

The “radical” critique: be careful what you wish for.

Arguments for open-science made in response to the conservative critique tend to assume that release of more data, code and papers is a pure good in and of itself, and downplay the political economy in which they are embedded. Indeed, as I just argued above, a fertile intellectual commons that all scholars, professional and otherwise, can use to pursue their own intellectual adventures is a worthy goal and, in an ideal world, should lead to a truly more democratic science. However, an interesting paper by sociologist David Tyfield, “Transition to Science 2.0: “Remoralizing” the Economy of Science”, says essentially: not so fast.

Tyfield suggests that the release of vast troves of data, papers or research results, although potentially beneficial to science as an enterprise, could simply exacerbate the trends towards the increasing marketization and corporatization of science and will disproportionately benefit large corporations. There are several trends that worry Tyfield and other scholars such as Philip Mirowski, Gary Hall and Eric Kansa, including:

  1. the capturing of publicly funded research value by commercial platforms

  2. the consolidation of a different set of gatekeepers and the introduction of yet more “metrics” of productivity used to “incentivize” scholars to work harder

  3. a focus on system-wide progress of science that ignores costs and benefits to individual humans, scientists or non-scientists

Let’s take them in order.

1. Capturing of academic labor output by commercial interests

Academia.edu has probably become familiar as a destination for scholars of all stripes to share their work via archiving their PDFs. Recently the company sent an email inviting select participants to join its editor program and serve as unpaid editors, recommending publications appearing on the site to others in their area of research expertise. This move was roundly criticized and led to another Twitter hashtag: #DeleteAcademiaEdu. Gary Hall wrote a paper, “Should This Be the Last Thing You Read on Academia.edu?” (available on Academia.edu!), comparing Academia.edu’s business model to Uber’s, noting that

…the majority of academics who are part of Academia.edu’s social network are the product of the state-regulated, public higher education system, as is their research (a system, it should be said, from which public funding is steadily being withdrawn). But just as Airbnb and Uber are parasitic on the public ‘infrastructure and the investment’ that was ‘made by cities a generation ago’ (roads, buildings, street lighting, etc.), so Academia.edu has a parasitical relationship to the public education system, in that these academics are labouring for it for free to help build its privately-owned for-profit platform by providing the aggregated input, data and attention value.

My own sense is that those running Academia.edu and ResearchGate are in it for idealistic reasons, and don’t see themselves as the next rapacious Uber-like company, but Hall’s point is that the business model that they operate under may force them to become increasingly extractive.

2. The tyranny of metrics

The second aspect is more subtle, as scholars are already ranked in terms of their “productivity” as measured through papers and grants. The counter-argument to the conservative reaction against open-science is that it brings more research outputs into that system, thus “incentivizing” the publication or release of intermediate results, useful research by-products, code and the like. As I noted above, these pieces can represent an intellectual commons where teams of scholars or individuals can build upon their release, thus incentivizing the generation of “public goods”. On this issue, Eric Kansa, a digital humanities scholar and practitioner of open data, in an article titled “It’s the Neoliberalism, Stupid: Why instrumentalist arguments for Open Access, Open Data, and Open Science are not enough”, exposes the limitations of relying on open-science metrics alone:

Metrics, even better Alt-metrics, won’t make researchers or research more creative and innovative. The crux of the problem centers on a Hunger Games-style “winner take all” dynamic that pervades commerce and in the Academy. A rapidly shrinking minority has any hope of gaining job security or the time and resources needed for autonomous research. In an employment environment where one slip means complete ejection from the academy, risk-taking becomes quasi-suicidal. With employment increasingly precarious, professional pressures balloon in ways that make risk taking and going outside of established norms unthinkable. Adding more or better metrics without addressing the underlying job security issues just adds to the ways people will be ejected from the research community.

Metrics, while valuable, need to carry fewer professional consequences. In other words, researchers need freedom to experiment and fail and not make every last article, grant proposal, or tweet “count.”

Kansa says that metrics, while useful up to a point, can be counterproductive because they simply add more steps to the treadmill on which scientists already operate. In other words, even though science as a whole may benefit, actual individual human scientists may not. After all, if your job depends on producing ever more research “products” every year (whether open or not), then adding yet another set of outputs to please a search or tenure committee doesn’t seem like much fun. Unless the fundamental model for hiring and funding changes and university administrations stop treating science as a business that must “grow”, new open-science “outputs” won’t substitute for papers and grants; they’ll just be added to them.

3. Who benefits?

Tyfield goes further in his analysis and suggests that the locus of progress is at the wrong level, and that open-science prioritizes “scientific progress” in the abstract, above improving the lot of the individual humans that comprise it:

Yet, as we have seen above, one needs only to ask the more humdrum question of “where are the jobs?” to see that the focus in such accounts is firmly at the system level of the global “data web” and the accelerated “progress” of “science” while totally neglecting that of the human individual and his/her place in such a society. It thus fails (and is likely to be seen to do so relatively quickly) precisely the test of moral economy that has triggered the breakdown of the passing order and the pursuit of transition to another: it massively rewards the undeserving and impoverishes the many and fails to deliver the “goods” of more, better and more-democratically-engaged knowledge that tackles the urgent and “wicked” problems of the multiple environmental, health and resource-based crises.

Further, he argues that open-science could:

undermine the compensation of human knowledge labour—albeit under the seemingly democratizing banner of “free information”—upon which such a system is entirely dependent. Furthermore, this analysis not only thereby destroys the “livelihoods” of a genuinely “creative” “knowledge-based economy” but also sponsors the construction of a system characterized by even more concentrated corporate control of knowledge (eg. information, data) than that of the IP-intensive corporate model of neoliberalism it ostensibly subverts.

Mirowski is even more blunt in his assessment of open-science:

It would be misguided to infer that Science 2.0 is being driven by some technological imperative to ‘improve’ science in any coherent sense. Rather, the objective of each and every internet innovation in this area is rather to further impose neoliberal market-like organization upon the previously private idiosyncratic practices of the individual scientist….Open Science 2.0 does not exist to democratize or otherwise improve research. Rather, it is engineered to position a few large firms at the electronic portals of the modern commercialization of knowledge.

This seems to me an excessively pessimistic view of open-science. Many of the most exciting initiatives in open science have been grass-roots driven efforts, especially the push towards preprints in biology. It’s hard to see how the rise of preprints is more market-like than what already exists in the rush towards submission in prestige journals. I have personally supported open science for many years for its potential to improve reproducibility, and to produce open-source scientific software that is beneficial to all. (Releasing scientific software under open source licenses is now a stated goal of the scientific establishment: in days past I would have needed to spend a good deal of effort convincing otherwise skeptical senior colleagues and universities of the value of open-sourcing my own efforts. Now this is less necessary, which counts as some kind of progress.)

Nevertheless, Tyfield and Mirowski are right to point out the dangers of a pollyannaish view of the digitization of scientific practices. After all, a democratic, decentralized and open-source ethos was part of the founding principles of many of the now-dominant market players in the digital economy such as Google back in the late 1990s and early 2000s. And as these companies have grown to become even more powerful, many of their vaunted principles have given way to a more winner-take-all approach.

It is therefore not unreasonable to be concerned that similar dynamics could occur in the open science world. No doubt they will. But that is different from saying that all open-science practices are being explicitly engineered towards the undesirable neoliberal outcome described by these authors. The challenge is partially biological in nature: how to create a system in which co-operative behaviour leads to the benefits of open science practices being spread across many individuals, and one that resists the encroachment of cheaters.

Reframing the question

Many arguments made in support of or in opposition to open-science are ultimately unsatisfying because they both frame “success” as individual scientists adapting to existing and fixed institutional structures and norms. We should, instead, turn this question around and ask how do we use open science approaches in the context of retooling our institutions to benefit actual living and breathing humans (scientists and nonscientists)? How can we use open science to enable as many people as possible who have the interest and talent to pursue science for its own sake and to generate knowledge that is broadly useful for society, and not just elite institutions, venture capital firms or global megacorporations? This reframing takes the conversation out of the instrumentalist language of “carrots-and-sticks”, “rewards” and “incentives”, since as previously discussed here on the Ronin blog, any such system can be used to service questionable ends. We should, as Ernesto Priego says, be “working towards a type of scholarship which is about learning from each other, not about surveillance and gatekeeping”. It also means prioritizing open science approaches that benefit not only institutions or science in the abstract but help improve the lot of individual working scientists.

So what would this look like? If traditional measures of research “quality” (see “Excellence R Us: University Research and the Fetishisation of Excellence”) and progress of science in the abstract are not sufficient, how will we know if open science is helping to drive an inclusive and humane approach to science? A partial list of what this might look like would include: “permission-less” innovation (the end of paywalls, end-user license agreements and data embargoes would be extended to all, not just institutionally-based researchers); a more equitable distribution of power and resources (a shift away from massive labs and distribution of funding geared towards living wages for the many, rather than stability and large rewards for an elite few); a rise in independent scholarship (these might be considered the “research parasites” of the NEJM, but it would not be a one-way street: independent scholars would contribute back to the commons, perhaps by releasing data under GPL copyleft-style licenses); and an open notebook science that is structured to enable learning and not surveillance. There is obviously a lot of overlap with more traditional arguments for open-science with which I fully agree, but the structural nature of the actual working conditions for scientists and the political economy in which they are embedded need to be kept firmly in focus.

How to get there?

So how can we begin to build a human-centered open-science? Here are a few places to start:

1) Strengthening existing public institutions such as libraries to support open science. Shrinking library budgets have reduced the ability of libraries to perform many of their core functions, let alone the new ones that scholars need. This is an area where there is an increased role for the state (via funding and support) to provide the core infrastructure for open science that is not subject to the vagaries of the market.

2) Explore platform cooperativism and commons-based models for open-science. The decentralized architecture of the Internet and the libertarian promises of the so-called “sharing economy” have not magically created a nirvana where people get paid or credited fairly for value created. (Scholars from the social sciences have generally been way ahead of scientists and technologists in recognizing these trends.) Ownership and governance models matter much more than the technical architecture. As we develop open science platforms we should follow and draw inspiration from the platform cooperativism movement in which users have at least partial ownership or control of the platform, rather than simply being passive nodes in an “on-demand” economy as in Uber or AirBnB.

3) Push for larger-scale social and economic changes. In the long run perhaps only larger socioeconomic changes will be sufficient to underwrite the ideal of an inclusive open-science. One such change gaining steam is the push towards a universal basic income (UBI): paying all citizens a fixed basic income regardless of circumstances. Sociologists such as Guy Standing have argued that the rise of automation and the precarity of work (already a reality with the postdoctoral scholar glut) will eventually make something like a UBI a necessity (the exact form it takes, however, is critical). In science, a UBI could enable the true benefits of open science approaches by decoupling job security from arbitrary notions of research “productivity”. Extending the privilege of pursuing whatever truly interests them, currently enjoyed (in principle) only by tenured faculty, to scientists who currently lack such job protections would take a huge amount of pressure off young investigators who currently feel a need to squeeze as much work as possible out of every dataset, graduate student or postdoc. Of course many kinds of science require more resources than a single faculty member’s paycheck, but many labs are likely bigger than they “need” to be and a UBI could have the effect of reducing the pressures (real or imagined) to “grow” one’s lab (see the “Problem with Building a Group” in Kitsune #2).

The naysayers out there (whether from the “conservative” or “radical” camp) are likely to scoff at many, if not all, of these proposed changes and dismiss them as non-starters, or quibble with the exact details. The current funding climate certainly doesn’t favour changes, but that doesn’t mean that change isn’t possible. We can start now.
