What We Learned from Our First Data Science for Open WASH Data Course

This article was first published on openwashdata and kindly contributed to R-bloggers.

A Global Response to a Critical Need

When we launched the first Data Science for Open WASH Data (ds4owd) course in October 2023, 235 WASH professionals from 46 countries signed up, validating our belief that this community is eager to acquire data science skills.

Figure 1: World map showing the number of registrations by country.

The diverse backgrounds of those who registered told a compelling story about the universal need for better data practices within the WASH sector, practices that can facilitate collaboration between organizations and across countries.

Figure 2: Bar chart of the organization types that registered WASH professionals work for.

The Data Reality Check

Our pre-course survey revealed exactly what we suspected: WASH professionals are generating valuable data but struggling to share and manage it. Most of the potential participants were storing data in formats poorly suited to machine readability [1], without any standardized workflows for collaboration or sharing.

Figure 3: Bar chart of the method used for data storage at the time of registration.

Even more telling was the range of programming experience. While participants brought diverse professional expertise, many (around 38%) were starting their coding journey from scratch. At the same time, 13% had already written programs for their own use that were a couple of pages long.

Figure 4: Bar chart showing the level of programming experience.

From Learning to Application

Of the 235 initial registrants, 65 participants completed at least half of the modules, a testament to both the course’s rigor and participants’ commitment.
But completion rates only tell part of the story.

The real success lies in the transformation we measured through our post-course assessments. Participants didn’t just learn tools; they gained confidence in applying them to real-world challenges:

72% rated themselves as competent R users
52% felt competent using Git and GitHub for version control
72% expressed confidence applying course skills to real-world projects

Figure 5: Self-assessment of R, Git, and GitHub competency and of confidence applying skills to real-world data after completing the course.

The most exciting outcome? We collaborated with graduates on taking the next step: publishing their own data packages using the workflow provided by our CRAN package washr, embracing FAIR principles, and contributing to the growing ecosystem of open WASH data. This is something we will emphasize more during our next iteration.

The Impact Beyond Numbers

Our capstone projects became showcases of the practical application of what the course taught. These weren’t just assignments; they are valuable data contributions to the WASH community.

One participant noted: “(What I liked about the course was) the fact that it introduced me to a field (Data Science) that I just used to hear about, little did I know it was valuable in my career path…”

What’s Next: ds4owd-002

The success of our first iteration has energized us for round two. ds4owd-002 will run from September to December 2024, building on lessons learned while maintaining the core elements that made the first course successful.

Whether you’re managing water quality data in spreadsheets or leading multi-country WASH programs, this course offers a pathway to becoming a more effective, data-savvy professional.

The people who signed up for our first course weren’t just registering to learn; they were joining a movement toward more open, collaborative WASH practice.
When we share data effectively, we accelerate progress toward universal access to water and sanitation.

Ready to transform your data practices? Visit our course page to learn more, sign up for ds4owd-002, and join our growing community of data-driven WASH professionals.

Footnotes

[1] Machine-readable data is data structured and formatted so that computer programs can process, interpret, and analyze it automatically, without human intervention. Examples include CSV, JSON, and XML.