The Arctic Data Center (ADC) strongly values feedback from our researchers and is committed to adapting to the evolving needs of the Arctic research community. Our support has enabled the archiving of 7,404 datasets, totaling 58 TiB of data, within the repository. In the last two years, the center has sharply increased the amount of archived and published data, and has incorporated ethical data practices into our stored metadata. This significant growth has spurred a new wave of innovation, guiding the center’s priorities to better serve the Arctic research community in this next phase of operations.
In April, the center launched the 2024 Community Survey to gather input from the broader Arctic research community. This initiative aimed to identify areas for improvement in our data science training, available opportunities, data curation, data management tools, and services. Open to anyone involved or engaged in Arctic research activities, the survey collected anonymized input from 140 participants. With these results, we aim to better understand how the ADC can build more capacity for Arctic communities and enhance our support systems.
Survey participants came from diverse backgrounds in the Arctic research community, representing various career stages, disciplines, and levels of experience. Notably, a majority of participants indicated that they work with data in their research, which highlights the significant role data management plays in their work and underscores the importance of our data services as we support researcher needs. Although a pool of 140 participants provides valuable insights, it represents only a small fraction of the much larger Arctic research community, and we recognize there might be other perspectives and experiences we were not able to gather. Many of the participants who are actively conducting Arctic research are at a mid-level or senior stage of their careers (Figures 1.0 and 1.1).
Because fewer than half of survey participants are early career researchers (ECRs), results may be skewed towards the experiences of mid- to senior-level researchers (Figure 1.0). This limitation can overshadow the differing challenges and unique needs of ECRs. To help address these challenges, the ADC hosts data science training opportunities for Arctic researchers at all career stages and aims to financially support all participants, which can be especially valuable for ECRs who face cost barriers to attending workshops. We tailor our training opportunities to different skill levels in different programming languages so that participants can learn more about a system like ours and build key data management skills. Every year we host about three data science training opportunities for the community, and we hope to expand attendee capacity in the near future.
The community survey revealed several key insights into the needs of Arctic researchers, including data sharing infrastructure, data science training capacity, collaboration, and open access challenges. Many individuals noted barriers to accessing standardized datasets and frustration with technical challenges, such as handling large datasets, insufficient cloud storage systems, and limited data management support. These challenges might be more prevalent for researchers who manage large and complex geospatial and environmental data.
Specifically, 63% of surveyed participants answered that they handle large datasets (Figure 2.0). The handling of large datasets is of critical importance for the ADC’s future data support and infrastructure. In 2022, the ADC held a cumulative total of about 14 TiB of data, whereas the repository now holds 58 TiB, roughly a fourfold increase. This growth reflects the influx of large datasets processed and uploaded by our data curation team, which continues to support researchers’ expanding data needs. Based on current research and technology trends, we fully expect this demand to keep increasing over time.
As climate change intensifies and the need for geospatial data grows, we are committed to enhancing our suite of tools to streamline Arctic researchers’ data uploads to the repository. Our development team is currently implementing features to support direct uploads of large datasets to our site, marking the first step towards a more comprehensive suite of tools designed to aid data handling. We aim to make further improvements to both our online submission editor and our support services.
Another theme emerging from our survey is the importance of increased training opportunities, particularly in data curation, management, and ethics. Nearly 60% of survey participants reported little or no familiarity with the CARE (Collective Benefit, Authority to Control, Responsibility, and Ethics) Principles, and over 90% of respondents were similarly unfamiliar with the SHARE (Sustainable, Holistic, Accessible, Relevant, and Equitable) Principles. These frameworks are valuable for guiding the research community toward responsible data stewardship (Figure 2.1), and the survey results highlight the need for greater emphasis on these frameworks in research practices across the entire Arctic community. Over the past few years, the ADC has made a conscious effort to address ethical gaps in Arctic research. We added an ethical research practices section and data sensitivity tags as requirements in our online data submission editor so that researchers consider the ethical concerns that surround open science. Both tools serve as mechanisms for incorporating the CARE Principles and considering Indigenous Peoples in the data uploaded to the repository.
As the ADC continues operations, we aim to use the input from this survey and other recent publications from the community, such as Earth Science Data Repositories: Implementing the CARE Principles, as roadmaps for improving our data transparency and handling ethical concerns better in the future. To better familiarize the community with these principles, we recommend online resources involving ethical research practices, CARE principles and working with sensitive data, and SHARE principles for conducting research in the Arctic.
Furthermore, data sharing is a crucial component of open science, promoting transparency, collaboration, and innovation. By depositing data into a trusted repository, researchers can ensure their data is preserved, discoverable, and accessible to a broader audience. However, many survey participants noted concerns about the fragmented nature of data repositories and the challenges this poses for Arctic researchers who are not funded by the National Science Foundation (NSF).
Of those surveyed, more than half noted they have not published data to the ADC (Figure 3.0). Additionally, 51% of respondents noted that only some of their data has been made publicly available, which may partly explain the low publishing rates on the ADC. Several other factors likely contribute to these statistics, including sensitivity restrictions, Indigenous data sovereignty, limited time for data curation, lack of structured data knowledge, government restrictions, and more. Researchers may also encounter challenges such as unfamiliarity with the repository or other technical difficulties. We see this as an opportunity for growth and strive to address it using the feedback we’ve received. With the support of our outreach team, we aim to leverage our connections within the broader Arctic research community to bridge the gap with researchers who are unfamiliar with the repository and to increase the visibility of our technology improvements, ensuring we can serve as a valuable resource to those working with data (Figure 3.1).
The ADC is an openly accessible repository, with our data available for anyone to reuse. We continuously strive to make the data we archive more findable, accessible, interoperable, and reusable (FAIR). In recent years, the ADC has collaborated with various projects and Arctic community members to foster ideas that enhance data integration and harmonization, ultimately creating value-added data products like our portal-making systems. A key collaborator, the Permafrost Discovery Gateway (PDG), represents our initiative to harness our cyberinfrastructure for developing powerful interactive visualization tools that researchers, educators, and Arctic community members can use. Moving forward, the ADC aims to further build out our portal-making systems so that they are more sustainable and functional for Arctic researchers who want to make their data more accessible.
Overall, the responses to our survey have yielded valuable insights into the perspectives and experiences of the Arctic research community, and will help guide the ADC in refining and expanding our cyberinfrastructure to meet researchers’ evolving needs. Effective data preservation remains our core mission, and we are committed to overcoming both technical and nontechnical challenges to support a diverse community with a wide range of needs and priorities. We sincerely thank everyone who contributed to the survey – your specific feedback and recommendations will be used internally to shape our future operations and key goals. As we move forward, the ADC remains committed to empowering researchers in preserving and discovering all NSF-funded Arctic research products, with open access and collaboration as key pillars, both now and in the future.
For any suggestions you would like to share with the ADC, please email info@arcticdata.io, and for any other data support services, contact support@arcticdata.io.
Written by Angie Garcia
Community Engagement and Outreach Coordinator, Arctic Data Center
With help from the rest of the Arctic Data Center Team