Mark Hahnel on the Future of Open Science, Publishing, and Figshare
- Podcast
- January 23, 2023
In this episode, Nikesh Gosalia talks to Mark Hahnel, Founder and CEO of Figshare. Mark talks about his background in genomics and how a gap year spent travelling eventually led him to do a PhD in stem cell biology. As a PhD student, the struggles he experienced trying to publish his research findings inspired him to build his own publishing platform, which is now known as Figshare. For Figshare’s 10th anniversary, Mark reflects on Figshare’s collaboration with Digital Science and how the visionary thinking of their CEO led to the birth of Figshare before open science became common practice.
Mark talks about the different versions of Figshare, from the free figshare.com, which is open to everyone, to purpose-built versions of Figshare meant for specific organizations. This ties in with the concept of “as open as possible, as closed as necessary”, where, for safety reasons, not all datasets should be publicly available. Mark shares his thoughts on what academics want out of publishing: something fast, good, open, and possibly free. He also addresses some problems with open data systems like Figshare, such as quality control, lack of data curation, no peer review, and content monitoring issues.
Mark Hahnel is the Founder and CEO of Figshare, the all-in-one repository for papers, FAIR data, and nontraditional research outputs. He is passionate about open science and its potential to bring positive change to the research community. Mark has acted as an advisor for the Springer Nature master classes and is currently on the advisory board for the Directory of Open Access Journals (DOAJ). He can be reached on Twitter or LinkedIn.
Missed the previous parts? Listen to Part 1 and Part 2.
Nikesh Gosalia
Since it’s so deep rooted, and interconnected as well, to a lot of the incentives and the rewards that you spoke about, as far as a researcher is concerned, it could be career progression, it could be furthering themselves in terms of a specific research field that they are working in. But just from your perspective again, are there any specific, tangible incentives that can be given to, I don’t know, encourage them to partake in the open science movement?
Mark Hahnel
Science communication and research communication, getting these messages out there, is an important one. I know that if you publish a paper and link it to a dataset in a repository, on average, you’ll get 25% more citations to the paper itself. That comes from a study done on half a million papers, so it’s not low-end numbers. That for me as a researcher would be an incentive. I’d love 25% more citations on my papers, right? My h-index is not very good. I know that because I work in the field. Not everybody who is working in research knows that because they are busy, because they are focused on their own work. I would still be focused on that super niche area of stem cell biology. For the communication around these things, there’s a lot of work that goes into it. I think librarians are heroes for doing this. I think there’s a lot of science communication officers within research organizations that do a great job of this, translating the research for the lay audience so that people can understand things.
I think we hit a side curve recently with all this fake news stuff and all of the anti-vax stuff. For me, it just always comes down to people can get behind the idea that if you have all of the information, you can make an informed decision. If you don’t have all of the information, you can’t make an informed decision. But that’s really talking about what’s good for humanity and not what’s good for the researcher. I think for the researcher, it has to be, you will get rewarded in certain ways by your funder for working in open practices.
We are still in the game of carrots and sticks. I can tell you about the carrots. I can tell you about that citation impact increase. I can tell you, if you publish openly, then more people are going to view your content, download your content, all of these things for open access publishing, proven many times over. But until it comes from the person who pays you, it’s very hard to move the needle on that. And so, as I have mentioned already today, the NIH is a huge one. The biggest funder of biomedical research in the world is saying, if we fund you, when you publish your papers, you have to publish your data. Then there is the White House Office of Science and Technology Policy. As you can hear, I am not American, but America, which is the biggest producer of research in the world, has said that by 2026, for every bit of federally funded research, which includes the National Institutes of Health, NASA, and NOAA, which is the ocean and atmospheric agency, all of the papers have to be open, and all of the data has to be open, full stop.
I think we are moving to a world where the sticks are going to catch up with these naysayers first, and they are just going to have to do it. Then, we’ll see the flourishing of the rewards and the incentives. I think incentives are always difficult though, because on average, papers that have datasets attached get 25% more citations. But not all of them do, right? There’s a bell curve there. Just because I say that you will get 25% more citations, you might get zero more citations, because your research isn’t that interesting. This is the dog-eat-dog world of academia. I think the only way to truly fix that is to change the structure of how academia works. You have a lot fewer problems in pharmaceutical companies around good data practices, good publication practices, and they are incentivized to make money. They have the biggest perverse incentive of all. I think if you change the career structure within universities and within academia, that could be the other thing that allows us to move further and faster and doesn’t bring so many perverse incentives into the structure in which we publish research.
Nikesh Gosalia
With all of these trends, Mark, that we spoke about, like you mentioned, there is the need to maybe publish faster, make it more open. We have probably started on this journey, and we’d love to, of course. We’ll see acceleration with some of the policy changes. But just from your perspective, as far as the traditional publishing space is concerned, how do you see that in, say, five years’ time? We have seen the growth of preprints. There is the increasing use of AI. There are more and more conversations around making it more open, but at the same time, going beyond the journal article. How do we engage with lay audiences? How do you see the traditional publishing space developing over the next five years?
Mark Hahnel
There are a few bits in there. It’s funny, I have just come back from traveling around Asia Pacific with a work colleague, and we were talking about some of this stuff. He was quizzing me on what I thought would happen in that timeframe. You may have seen DALL·E, you know, the AI picture drawer? I don’t know what it’s officially called. But apparently, that works for movies. We already have tools that can take a paper and try and make a lay summary out of it. He said, do you think we’ll get explainer videos automatically generated for papers? I was struggling with the concept of, well, these papers are so niche that DALL·E or similar things to it would struggle to get it accurate enough that the researcher wouldn’t be like, well, that’s not technically true. If you’d asked me five years ago if we could get something like DALL·E today, then I would have just said no, as well. I think it’s always a fantastic thing to be surprised by research. I think we’ll see, particularly in the science communication space, lots more things like that, lots more machine interpretation of information in a way that makes it more accessible to people. I think that’s going to be a cool thing.
There’s going to be, in academic publishing, little step changes on this. We have a sister organization at Digital Science called ripeta. When I was talking before about fast, good, open publishing, there’s little things that machines can do for you. And so, what they do is they scan the paper and say, just from an ethics standpoint, just from an integrity standpoint, is this an academic paper? Does it have a data availability statement? Does it have an ethics statement? Does it have author affiliations? If you talk about preprints, and you have got some person with ulterior incentives saying, here’s my paper, it’s all true. It’s not peer reviewed. Then, you have the news referring to it, because, “Oh, it’s got a DOI, it’s a published article.” These little tools can say, well, actually, if I run that paper through it, these people weren’t trained in the space and therefore haven’t written a truly academic paper. You can start saying, that’s probably a flag. I think the really interesting part gets into the peer review side of things. And so, ripeta can say, okay, can we see who the reviewers were? Can we see who the authors were? Can we see if there are any connections between the people here and start flagging fraud that way?
For me personally, and this is quite controversial, I think the biggest thing that needs to change in that space is paid peer review. I argue with the editors a lot about this, who say, it’ll never work, we’ve tried it before, all of these things, and I just don’t believe them. I think that the last great shakeup in academic publishing is paid peer review. As soon as somebody solves it, we will have a lot more transparency, with some of these tools to check that you are the right person. That’s how we’ll get to fast, good, open publishing. For people who say, well, we can’t afford to pay more, I’d say a lot of the value add from academic publishing is peer review. That should be the thing that’s paid for above anything else to begin with. Copyediting is cool. It makes it look pretty. But the machines can help with that in the next 10 years, I am sure.
Nikesh Gosalia
While you are talking about the next 5 to 10 years, Mark, what do you think is the future of Figshare? What would, at a very broad level, be your plans for the next few years? Provided, of course, you don’t move on to a new fantastic idea, where would you like to see the company in, say, five years, with Figshare developing and growing?
Mark Hahnel
I’ll go and solve paid peer review? That’s a small thing right now. But I am very happy at Figshare. I have been here for a decade, and it’s been great to see it grow.
I think the National Institutes of Health’s GREI Initiative, which is Generalist Repository Ecosystem Initiative, is a great way to fund repositories to work together and to have consistent standards across repositories. And so, for that reason, I don’t care if it’s Figshare, or other systems that are providing this functionality. But I think every university in the world needs something like this. There’s a lot of universities out there. There’s a lot of universities that don’t have a system like this. I think that Figshare could serve a role in making more content openly available.
I think for me, it’s thinking about two factors and two different types of content. That is, if you want a more equitable society and you believe that people have good ideas, no matter where they are, then if you put an academic lens on it, I think a big goal for the academic publishing dissemination community should be, does anybody anywhere have the ability to publish papers or data in a fast, good, open way, and to access papers and data? When I say data, I mean nontraditional research outputs, just everything else, all of the files. I think we can serve a great role in doing that. As I said, there are six million outputs on Figshare infrastructure that wouldn’t have been there otherwise. And so, I think we can facilitate aiming towards this: does anybody anywhere have the ability to publish papers and data and access papers and data? For the next 10 years, that’s the big focus for us: how can we help do that? Obviously, scaling to more universities and working with more publishers can do that.
I think from a technical viewpoint, as I mentioned before, we have been building this for 10 years. We have some great technology. The colleague I was talking to about the DALL·E stuff and I have this joke that Figshare is never going to be finished. You can build products that serve that one role, and they are finished. But the ecosystem evolves, and there are more types of content. We allow you to upload any file format, and we aim to preview it in the browser. But for things like Jupyter Notebooks, or IPython notebooks, it’s very hard to have that executable file run in browser, on a platform like Figshare, without querying some third-party APIs or having very large compute for all of the dependencies. And so, there’s always going to be more stuff we can do to make that better, because there is an unlimited number of file formats. I’ll be very happy with the tech stack developing on that level and making everything more computational.
Me personally, I think I’d like to just keep encouraging people to do that. I think I am more of a science or idea communicator on that level than I am on, you know, the other aspects of running a business. And so, I think I’d like to concentrate more on that side of things going forward for the next 10 years, or solve peer review.
Nikesh Gosalia
How do you keep yourself updated about everything that’s happening, maybe in the industry, and even beyond, I mean some of the progress that we are making in technology or open data?
Mark Hahnel
It’s always good to have a curious mind on those things. I think just reading everything and anything you can is always a good thing. But I am lucky, in that I work at an organization that has 12 different companies working in the space who all share a Slack, where people can say, hey, did you see this? Hey, did you see that? But you get the usual parties: The Scholarly Kitchen, Twitter. I am very active on Twitter and have been for the last 10 years. It’s a very welcoming community in the academic space. But I think it’s also about following podcasts like this and making sure that you are listening to what’s happening.
I think a lot of people will self-serve, because the way in which you discover content these days is pretty easy. I am still upset when something’s happening, and I don’t have access to it, whether it’s an academic paper, and you have to use illicit methods to get there, or there’s a boxing match on and I haven’t paid for it, kind of thing. I think if you want to consume the content, it’s very easy to find it. Using things like Spotify or using Google feeds and things like this is the way to do it. Or come and work at Digital Science and you get to hear about all these things every day in the Reading Corner Slack channel. But maybe I’ll start tweeting more of those out, so you don’t have to come and work at Digital Science to get that access.
Nikesh Gosalia
What do you feel your personal mission is? What kind of impact do you want to leave on the science community or just world in general?
I know, and I have heard you talk at many conferences, that you personally are very passionate about this. Like you correctly mentioned, Mark, yes, Figshare is one solution, one way to solve this. But I think you are really passionate about solving it as a community, and it doesn’t matter where it comes from.
Mark Hahnel
As I mentioned at the beginning, I fell into this space. I never intended to come into this space. I liked working in a field where you feel like you are adding value. This isn’t a diss to people who are working at games companies or things like that, because games companies bring people lots of pleasure. But sometimes I see company logos and company mottos, and you think, that’s a strange thing for that company to have, you know. But everybody can change the world in their own way and in the job that they are doing.
If they have that belief, that’s fantastic.
I think that applies to whatever field you work in, but most notably the academic setting, which is full of interesting quirks and legacy systems and legacy infrastructure and legacy rules. I think for me, it’s always been to put more value in than you take out. This is something we had to address very early on as a company in the private sector. As I said, I didn’t go out looking to become a private company. I had two other conversations at the beginning of Figshare, both of which I didn’t seek out. One with the Wellcome Trust, one with the Sloan Foundation, about whether this could operate as a not for profit. I think if it had done, it wouldn’t have survived. We wouldn’t have focused on a sustainable business model right from the very start. I think that’s a big question for anybody looking to work in this space or who’s got an idea: regardless of how you set up the structure, just think about the sustainability as the number one priority at all times, and the incentive structures. I think because of that, I got asked very early on. I have had some very direct questions in different countries around the world, saying what are you doing here? What are you as an Englishman doing here telling us what to do with data, right, and flogging your wares? I am like, we always say if it’s a good fit, it’s a good fit. If it’s not, don’t worry. We are doing okay. That message is always, put more value in than you take out.
I think a life well lived, whether it’s friendships, companies, the academic space, comes down to that idea: if you can aim to put more value in than you take out, then it will be a life well lived.
Nikesh Gosalia
Thank you so much, Mark, for being our guest today. I really enjoyed our conversation. Obviously, I hope to meet you quite often, being in the UK, given that you speak at so many events and a lot of those events are common for all of us to attend.
Mark Hahnel
Thank you! Drinks soon!
Nikesh Gosalia
Thank you everyone for joining us. Stay tuned for our next episode.
All Things SciComm is a weekly podcast brought to you by ScienceTalks. You can subscribe on Apple Podcasts, Spotify, and Google Podcasts.