Open Infrastructures and the Future of Knowledge Production, part 2

In my last post, I unpacked some of the reasons why open infrastructures matter for the future of knowledge production, and I talked a bit about how Humanities Commons and hcommons.social strive to live out the principles of community governance that truly open infrastructure requires. But I ended on a less cheerleadery note: We aren’t a perfect alternative to the corporate platforms by which we’re surrounded. And this is where we need to dig down into the dirty underside of digital infrastructure. As Deb Chachra points out, the term “infrastructure” literally points to those systems that are hidden, in our walls, under our floors, and buried underground. If we are going to mitigate the inequities created by and sustained through our infrastructures, we have to get busy unearthing those systems and finding ways to build new ones.

And so: We need to take a hard look at the fact that the infrastructure that Humanities Commons is built upon is AWS, or Amazon Web Services. As you might guess from the name, AWS is part of the Greater Jeff Bezos Empire, and every dollar that we spend to host with them helps to keep that empire running. And run it does! Amazon’s revenue derived from AWS passed $80 billion-with-a-b in 2022, and as of August 2023, AWS hosted 42 percent of the top 100,000 websites, and 25 percent of the top one million (ironically enough including BuiltWith, the site from which these data are made available).

Why has Amazon become such a powerful force in web hosting and cloud computing? Largely because they provide not just servers but a powerful and wide-ranging suite of tools that help folks like us not just make our platform available but also keep it stable and secure and scale it with enormous flexibility. AWS provides connected equipment and tools that would be more than a full-time job for someone to maintain in-house; it enables redundancy and global reach at speed; and it’s relatively easy to manage.

So… it works for us, just as it works for 42,000 of the top 100,000 websites across the internet. But I’m not happy about it. It’s not just that I hate feeding more money into the Bezos empire every month, but that I know for certain that our values and Bezos’s do not align. And every so often I have to stop and ask myself how much good it does for us to build pathways of escape from the extractive clutches of Elsevier and Springer-Nature, only to have those pathways deliver us all into the gaping maw of Amazon?

AWS has a stranglehold on web-based platforms of our size, as we’re too complicated for a server kept under the desk, too big for a smaller hosting service, and too small for our own data center. And if you don’t want to deal with the risks and costs involved in owning and operating the metal yourself, there just aren’t many alternatives, and certainly not many good ones.

Our host institution, Michigan State University, like most institutions its size, operates both a large-scale data center through our central IT unit and a high-performance computing center under the aegis of the office of research and innovation. The latter can’t really help us, as it’s focused pretty exclusively on computational uses and not at all on service hosting. And the former comes with a suite of restrictions and regulations in terms of access and security – pretty understandably so, given recent attacks and exploits such as the one that caused our neighbor to the east to disconnect the entire campus from the internet on the first day of classes – but nevertheless restrictions that make it impossible for us to be flexible enough with our work.

In fact, central IT strongly encourages projects like ours to make use of cloud computing, given the complexity of our needs and the risk aversion of the campus. And we have our pick! AWS, Microsoft’s Azure, and Google Cloud.

I just can’t help but think that it’s a Bad Thing for academic and nonprofit services like ours – services that are working to be open, and public, and values-aligned with our communities – to be dependent upon Silicon Valley megacorps for our very presence. We need alternatives. Real alternatives. And I fear that we’re going to have to invent them, because as the example of open access publishing demonstrates, waiting to see what commercial providers come up with is certain to increase our lock-in, and increase the level of resources they extract from our campuses.

So what might it look like if our infrastructure for the future of knowledge production and dissemination were community-led all the way down? What might enable the Commons to leave AWS behind and instead contribute our resources to supporting a truly shared, openly governed, not-for-profit cloud service? Could such a service be collaborative, with all member research institutions and organizations paying into a shared, professionally staffed data center?

King’s College London and Jisc think so – they established the first collaborative research data center in the world nine years ago, precisely in order to help UK institutions achieve economies of scale, to increase energy efficiency, and to reduce costs. Of course, it’s a lot easier to get all the UK institutions of higher education on board with such a centralized initiative, partly because there are fewer of them and partly because they are all centrally funded.

But what if Internet2, for instance, rather than restricting its areas of interest to networking and protocols and offering to connect member institutions with corporate cloud services, instead provided a real alternative – one that was not just developed for the academic community but that would be governed by that community? What if each member institution or organization agreed to contribute its existing infrastructure, along with its annual maintenance budget, to a shared, distributed, community-owned cloud computing center? Could excess capacity then be offered at reasonable prices to other nonprofit institutions or organizations or projects like mine, in a way that might entice them away from the Silicon Valley megacorps? Would our institutions, our libraries, our publishers, and our many other web-based projects find themselves with better control over their futures?

None of what I’m suggesting here would be easy, and a lot of the questions I’ve just asked fall – at least for the moment – into the realm of the pipe dream. But if we were willing to press forward with them, we might find ourselves in a world in which the scholarly communication infrastructures on which we build, develop, design, and publish our work can help us foster rather than hinder social and epistemic justice, can empower communities of practice by centering their needs and their work to meet them, and can enable trustworthy community governance and decision-making in support of truly open, public, shared infrastructures for the future of knowledge production.

Open Infrastructures and the Future of Knowledge Production, part 1

I’ve been thinking a good bit lately about the ways that the future of knowledge production depends upon the openness of the infrastructures that support our work. For a lot of people, the word “infrastructure” triggers a yawn reflex, and not without reason. As Deb Chachra points out in her brilliant new book, How Infrastructure Works, the best thing that infrastructure can do is remain invisible and just work. But as Chachra also argues, the shape of our entire culture is dependent on our infrastructure, and where inequities are part of those systems’ engineering, they constrain the ways that culture can evolve. Infrastructure matters enormously, and the scholarly communication infrastructures on which we build, develop, design, and publish our work have deep implications for our abilities to foster social and epistemic justice in our knowledge production and communication practices, to empower communities of practice and their concerns in the development and dissemination of knowledge, and to enable trustworthy governance and decision-making that is led by the communities that our publications and platforms are intended to serve. Our team is far from alone in thinking about these questions right now. We’re seeing the idea of “open infrastructure” pop up a lot lately, in no small part because folks are recognizing that a commitment to open, public infrastructures is necessary to ensure that scholarly communication can become actually equitable.

What do I mean by “actually equitable”? How might that sense of equity intersect with the aims of the open-access movement? Over the last twenty-plus years that movement has worked to transform scholarly communication, arguing in part that if our work could be read more openly by anyone, it might both have more impact on the world at large and create a more equitable knowledge environment. It’s of course true that open access in its many present flavors has done a lot to make more research available to be read online. But the movement toward open access began as a means of attempting to break the stranglehold that a few extractive corporate publishers have established over the research and publishing process – and in that, it hasn’t succeeded. The last decade in particular has revealed all of the resilience with which capital responds to challenges, as those corporate publishers have in fact become more profitable than ever. Not only have they figured out how to exploit article processing charges in order to make some work published in their journals openly available while continuing to charge libraries for subscriptions to the journals as a whole, but they’ve also developed whole new business plans like the so-called “read and publish” agreements that keep many institutions tied to them, and they’ve developed new platforms and infrastructures like discovery engines and research information management systems that serve to increase corporate lock-in over the work produced on campus.

For all these reasons, the 20th anniversary statement of the Budapest Open Access Initiative took on a slightly different focus, noting that “OA is not an end in itself, but a means to other ends, above all, to the equity, quality, usability, and sustainability of research.” In order to achieve those ends, the statement proposes several key recommendations – and chief among them?

Host OA research on open infrastructure. Host and publish OA texts, data, metadata, code, and other digital research outputs on open, community-controlled infrastructure. Use infrastructure that minimizes the risk of future access restrictions or control by commercial organizations. Where open infrastructure is not yet adequate for current needs, develop it further.

This recommendation recognizes that the control of the infrastructure by profit-seeking entities cements inequities – and this is true even where the large corporate publishers purport to create opportunities for the disadvantaged by offering fee waivers and discounts on their publishing charges. Those discounts only serve to normalize a culture in which it is considered correct for those who produce knowledge to pay corporations to host and circulate it.

What scholarly communication needs today, more than anything, is a broad-based sense of accountability to scholars and fields and institutions rather than shareholders. Hence the call in the 20th anniversary Budapest statement for hosting open access research on open infrastructure: infrastructure that is led by us, and accountable to us.

This is the fundamental orientation and driving purpose of Humanities Commons. Our goal is to provide a non-extractive, community-led and transparently governed alternative to commercial platforms. We also want to encourage our users to rethink the purposes and the dynamics of publishing altogether, in ways that might allow for the development of new, open, collective, equitable processes of creating and sharing knowledge that re-center agency over the ways that scholarly work develops and circulates with the scholars themselves. As a result, we have put in place a participatory governance structure that enables both individual users and our institutional sustaining members to have a voice in the project’s future, and we have developed network policies that emphasize inclusion and openness. We are committed to transparency in our finances, and most importantly to remaining not-for-profit in perpetuity.

We are also working to build and sustain the kinds of new platforms and services that will allow for rich conversations among members of our community and between that community and the rest of the world. A year ago, seeing the handwriting on the wall for the platform formerly known as Twitter (and frankly having suffered through quite a number of unhappy years there before the beginning of the end), we launched hcommons.social, a Hometown-flavored Mastodon instance, in the hopes of providing a collegial, community-oriented space for informal communication among scholars and practitioners everywhere. We currently have more than 2,000 users on our instance who are connecting with users throughout the Fediverse, and we support those users through a strong moderation policy and code of conduct. We also work to ensure that new policies and processes are discussed with that community before they’re implemented.

This kind of openness matters enormously, not just to ensure that we’re living up to the values that we’ve established for our projects, but to ensure that there’s a worthwhile future for them. Cory Doctorow has written extensively of late about what he has famously called the “enshittification” of the internet, a process in which value is sucked out of the community and into the pockets of shareholders. Users are left with no control over the platform, or the content they’ve provided to it. And this, he notes in a post on the new corporate platforms seeking to replace Twitter, remains true even if their C-suite is populated by good actors, because they’re still walled gardens.

The problem with walled gardens is partly about their ownership, but largely about their governance. It’s not just that the owners of any particular proprietary network might turn out to be racist, fascist megalomaniacs – it’s that we have no control if and when they do. Choosing open platforms means that we as users have a say in the future of the plots of ground we choose to develop. This is especially true for the kind of work, like knowledge production, that is intended to have a public benefit. It’s incumbent on us to ensure that those gardens aren’t walled, that they don’t just have a gate that management may one day decide to unlock to let select folks in or out. Rather, our gardens must be open from the start, open to connect and cultivate in the ways that we as a community decide.

As Doctorow notes, Mastodon is far from perfect, and as much as I love our own instance, hcommons.social is far from perfect. But we’re doing our best to ensure that we’re running it in the open. And operating in the open, both for the Commons and for hcommons.social, means for us that we are accountable to our users and responsible for safeguarding the openness of their work. Together, those two ideals undergird our commitment to provide alternatives to the many platforms that purport to make scholarly work more accessible but in fact serve as mechanisms of corporate data capture, extracting value from creators and institutions for private rather than public gain.

But, as I note, we aren’t a perfect solution to the problems of corporate control in scholarly communication. More on why in my next post.