Ten reasons to share your data

Making data available to the larger scientific community has many benefits.

27 April 2018

Amanda S. Barnard

COMMENT

Physical scientists can be a parsimonious bunch; we guard our data like it’s buried treasure. We spend considerable time seeking the resources to collect the data in the first place; once we have it we are trained to protect it at all costs.

But releasing our data into the wild can make it work for us instead of the other way around. Here’s how:

  1. Ambition – When we start a scientific study there is a competitive advantage to protecting a dataset. Exclusive access ensures you are the first to extract the most exciting discoveries. But when you have published those discoveries, your focus shifts to maximising the impact from them. At this point there is a competitive advantage to making the data available to the community, to increase engagement with your work and encourage readership and citations. Datasets can also be assigned a digital object identifier (DOI), so their uptake and impact can be tracked directly.

  2. Visibility – Sharing your data helps you stand out from the crowd by ensuring more people are aware of your results and your capabilities. Your research may be useful in areas and in ways you have never thought of, but researchers in those fields will not seek you out unless they see what you have to offer. As a rule, the researchers who are most visible (in any community) assume a mantle of leadership, and that could be you.

  3. Veracity – Sharing data encourages a high level of professionalism. When we are the only ones to interact with a dataset, our naming conventions, directory structures and formatting can become inconsistent and abstruse over time. When we share our data we tend to pay closer attention to making sure its structure is consistent, the nomenclature and storage system are logical, and the data curation adheres to international standards. This is one of the principles of the FAIR (Findable, Accessible, Interoperable, Reusable) data movement.

  4. Generosity – We all contribute to our community in a number of ways: We give our time as editors and referees, we give our time organizing meetings and conferences, we give time and resources to collaborators, and we give our networks to facilitate connections between colleagues. Why does our generosity so often disappear when it comes to data? Are some numbers on a page or screen really more valuable than the other things we give freely?

  5. Reciprocity – How many times have you downloaded a script, code, dataset or an individual result from the web? How often in your own research have you drawn upon the results others have published? Naturally, you would offer the same access to your data to other researchers, but responding to individual requests on a case-by-case basis is a very inefficient way to give back to your community. Open-access repositories of data are a more sensible way to manage the process of sharing.

  6. Connectivity – Ever heard the term ‘strength in numbers’? Being a part of a large globally distributed and diverse community has benefits that transcend individual pair-wise collaborations and linkages. For example, many research communities have found that a network of scientists can lobby for support for large infrastructure that individuals could never win. Sharing your data in research community forums can help you tap into a network of people and knowledge, and have an impact on decisions at a global level.

  7. Relevance – Generating, sorting and sharing data with others who need it for their research progresses the entire field. Data that is not actively progressing knowledge in a field rapidly becomes irrelevant. To retain a position as preeminent innovators, driving the invention and adoption of the next generation of technology, we need to ensure we embrace new ways of making this happen, beyond publishing our favourite results. Your data is relevant to the field when it is actively used by others in the field.

  8. Complacency – Ever read a draft so many times that your eyes become comfortable with the words and no longer see the typos? Sometimes we just can’t see them anymore, and the best solution is to ask our colleagues to handle the next revision. Sometimes it’s our minds (not our eyes) that become comfortable with the assumption that we are the experts and nothing will slip our notice. It’s the same with datasets. Sharing our results with a fresh pair of eyes (or many fresh pairs of eyes) ensures quality control, which benefits us as much as anyone else.

  9. Legacy – Tired of working on the same old dataset, but know there is so much more it has to offer? Moving on to something else does not mean your impact in an established field has to wane. Share your data and let others take up the challenge. Gifting the next generation of scientists with your data continues your legacy, building on what you have created, and frees you to work on new things.

  10. Money – The low success rates in funding schemes suggest that there are more great ideas than there is money to support them, but not all of these applications are great ideas. Many grant applications propose work that has already been tried (and funded) before, and found to fail. Sharing all your data, even the things that didn’t make it into your top publications, will save our funding bodies from supporting the same research over and over again. Sharing ‘negative’ results will ultimately free up more funding for genuinely new ideas and new scientific directions. Many funding bodies are also recognising the potential for great data-driven discoveries. Research that contains strong data management and analytics aspects is well regarded and, in many cases, well supported.

Amanda S. Barnard is a science leader at the Office of the Chief Executive Science Leader, Commonwealth Scientific and Industrial Research Organisation in Australia.
