This January, we kicked off our 2025 data science training with the Fundamentals in Data Management for Qualitative and Quantitative Arctic Research course in Santa Barbara, CA, with seventeen visiting Arctic researchers. The participants work in different disciplines with different research focuses, but all joined us with one common goal: to enhance their data science skills! This week-long course introduced non-programmers to a wide range of new skills and tools in RStudio, and ended with all participants feeling more confident and comfortable working in the coding environment.

Group photo of visiting researchers and instructors at NCEAS in Santa Barbara, CA.
Both instructors and participants had insightful and enriching experiences through our hands-on sessions and discussions. We focused on a wide array of topics, including writing good data management plans, tidy data, working with text data, metadata best practices, and cleaning and wrangling data, all of which you can find in our openly accessible learning curriculum. Additionally, we focused on reproducible data management practices and how researchers should consider ethical data collection and data sharing, data sovereignty, and the CARE principles. Collaborators from the Exchange for Local Observations and Knowledge of the Arctic (ELOKA) joined us to complement our ethical data collection session with insights from their work with Arctic communities.
The seventeen Arctic researchers visited us from the United States, Norway, the United Kingdom, and Canada, representing twelve different institutions conducting research across the pan-Arctic. Their unique backgrounds and research focuses brought different perspectives to our discussions. Below are some thoughts shared by this cohort:
Yueyi Che, a graduate student from Stanford University in the Earth and Planetary Science Department, explained that prior to the course she had little knowledge of programming in R and that she can’t wait to bring this knowledge back to her lab and department. Che noted, “I learned from a [data curator’s] perspective about what tidy data is and how a well-organized and well-documented dataset can elevate the impact and potential collaborations of my research.” Remarks like these highlight the importance of learning data science fundamentals and applying these practices to research and work conducted across pan-Arctic projects.

Che on a cruise sampling sea ice and seawater in Antarctica. Photo credit: Yueyi Che.
Building on this, the course is designed to enhance researchers’ confidence in using R, equipping them with hands-on experience they can apply directly to their work. Sarah Principato, a professor at Gettysburg College, echoed this sentiment and emphasized how valuable the course was in providing hands-on coding instruction in R and an overview of the different resources available at the Arctic Data Center. Dr. Principato plans to integrate the knowledge gained from this week-long course into her classes and research with students, further extending the course’s impact.
Heather Fair, a postdoctoral researcher in the lab of Professor Trinity Hamilton at the University of Minnesota, described the course as “a unique [opportunity she had] not encountered.” She emphasized its broader value, stating, “This type of training is not only valuable for the Arctic research community, but it should also be available for individuals interested in learning how scientists collect, standardize, analyze, and publish data… This training has the potential to bring together groups of decision makers and analysts to focus on solutions by using scientific data and human ingenuity.”

Fair conducting fieldwork on the Mendenhall Glacier with University of Alaska researchers and on the Matanuska Glacier. Photo credit: Heather Fair.
Joey Rotondo, a graduate student in the Atmospheric and Climate Science Department at the University of Washington, noted that “the most valuable aspect of the course for me was learning best practices for ethical and reproducible Arctic data management.”

Rotondo at the 2025 American Meteorological Society Annual Conference in New Orleans, LA. Photo credit: Joey Rotondo.
Additionally, Rotondo said, “This course will directly improve how I manage sea ice concentration and albedo datasets by ensuring they are well-documented, reproducible, and properly archived. Applying structured metadata and ethical data-sharing principles will enhance the accessibility and integrity of my research, making it more valuable to the broader climate science community. The skills I developed in R will also streamline my data processing, improving efficiency in my analyses.”
At the Arctic Data Center, we see the impact of our short courses spread far beyond the five days of in-person learning. At various outreach events, we often reconnect with past participants from our data science training who share positive feedback about their experiences and encourage others to attend future trainings. Welcoming different perspectives and backgrounds allows us to hear directly from the Arctic research community and gather feedback on how we can improve as a data center and better support the community. Jim Regetz, Director of Research Software Engineering at the National Center for Ecological Analysis and Synthesis (NCEAS), taught several technical lessons and noted, “It’s a really fun and rewarding opportunity to meet researchers who work in different places and disciplines, engage in different forms of Arctic research, and face their own unique data management and analysis challenges, but who are excited to learn generally useful and enduring skills for working productively with data – skills that just about any researcher can put to use.”
Many of our participants from this cohort found the course to be an enriching addition to their research experience. Thank you to NSF Award #2042102, which allowed us to fund the participation of eligible Arctic researchers and the development of this learning curriculum.
To discover more about what we covered during this course, please visit our virtual coursebook, which holds all of the learning materials from this rendition of the course.
Written by Angie Garcia
Community Engagement and Outreach Coordinator, Arctic Data Center