Migratory Connectivity in Wetland Birds (Citizen Science + Isotopes)

Category: Ornithology · Size: 1.7 GB · Format: ZIP · License: CC0-1.0 · Zenodo record · Data sheet on the CSDH

Citizen-science monitoring and distribution data for three wetland bird species (Sora, Virginia Rail, and Yellow Rail), combined with stable isotopes to trace migratory routes across the USA.

The data is mounted read-only at /srv/data/wetland-birds-migration/. Save anything you produce in your personal folder (~/).

⚠️ Large dataset (1.7 GB). Your session has 4 GB RAM and your home folder is shared — don't extract the whole archive. Read the entries you need straight from inside the ZIP (see below); if you must extract, take only specific files, not everything.
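If extraction really is unavoidable, the sketch below shows one way to pull only named entries out of an archive and into your home folder. The helper `extract_selected` and the destination folder `wetland-birds-extract` are hypothetical names chosen for illustration; `stable_isotopes.csv` is one of the entries listed in `data_files.zip`.

```python
from pathlib import Path
import zipfile


def extract_selected(archive, members, dest):
    """Extract only the named members of a ZIP archive into dest.

    Hypothetical helper: raises KeyError if a requested member is missing,
    so a typo never silently falls back to extracting nothing.
    """
    dest = Path(dest)
    dest.mkdir(parents=True, exist_ok=True)
    extracted = []
    with zipfile.ZipFile(archive) as zf:
        available = set(zf.namelist())
        for name in members:
            if name not in available:
                raise KeyError(f"{name!r} is not in {archive}")
            # ZipFile.extract returns the path of the file it created.
            extracted.append(Path(zf.extract(name, path=dest)))
    return extracted


# Example usage; runs only where the dataset is actually mounted.
archive = Path('/srv/data/wetland-birds-migration/data_files.zip')
if archive.exists():
    # The destination folder name is an assumption; any folder under ~/ works.
    target = Path.home() / 'wetland-birds-extract'
    print(extract_selected(archive, ['stable_isotopes.csv'], target))
```

Because only the listed members are written, disk use stays proportional to the files you asked for rather than to the whole archive.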

What's in the dataset

In [1]:
from pathlib import Path

DATA = Path('/srv/data/wetland-birds-migration')

for f in sorted(DATA.rglob('*')):
    if f.is_file():
        print(f"{f.relative_to(DATA)}  ({f.stat().st_size/1e6:,.1f} MB)")
data_files.zip  (108.9 MB)
intermediate_files.zip  (0.5 MB)
s_mrange.zip  (322.4 MB)
scripts.zip  (0.0 MB)
sdm_variables_1.zip  (239.4 MB)
sdm_variables_2.zip  (182.8 MB)
sdm_variables_3.zip  (683.7 MB)
shapefiles.zip  (1.5 MB)
vira_mrange.zip  (116.9 MB)
y_mrange.zip  (63.1 MB)
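The sizes above are compressed sizes. To judge whether an archive would fit in your session before touching its contents, a rough sketch like the following reads only each ZIP's central directory and reports how large the entries would be once decompressed. `archive_summary` is a hypothetical helper name, not part of the dataset's scripts.

```python
from pathlib import Path
import zipfile


def archive_summary(archive):
    """Return (n_files, compressed_bytes, uncompressed_bytes) for a ZIP.

    Only the archive's directory is read; no entry is decompressed.
    """
    with zipfile.ZipFile(archive) as zf:
        infos = [info for info in zf.infolist() if not info.is_dir()]
    packed = sum(info.compress_size for info in infos)
    unpacked = sum(info.file_size for info in infos)
    return len(infos), packed, unpacked


# Example usage; the loop is skipped where the dataset is not mounted.
DATA = Path('/srv/data/wetland-birds-migration')
if DATA.is_dir():
    for zp in sorted(DATA.glob('*.zip')):
        n, packed, unpacked = archive_summary(zp)
        print(f"{zp.name}: {n} files, {packed/1e6:,.1f} MB packed, "
              f"{unpacked/1e6:,.1f} MB unpacked")
```

An archive whose unpacked total approaches the 4 GB session limit is a good candidate for reading entry by entry rather than loading all at once.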

Explore the ZIP

The dataset comes compressed. We list its contents without extracting; if it contains CSVs, pandas can read them straight from inside the ZIP. Remember: /srv/data is read-only — if you need to extract, do it into your folder (~/).

In [2]:
import zipfile
import pandas as pd

# Open the first archive (alphabetical order) and list its entries without extracting.
zips = sorted(DATA.rglob('*.zip'))
z = zipfile.ZipFile(zips[0])
print('Using:', zips[0].name)
names = z.namelist()
print(f'{len(names)} files inside; first 20:')
for n in names[:20]:
    print('  ', n)

# Read the first CSV straight from inside the archive; nrows caps how much is loaded.
csv_inside = [n for n in names if n.lower().endswith('.csv')]
if csv_inside:
    df = pd.read_csv(z.open(csv_inside[0]), nrows=100_000, low_memory=False)
    display(df.head())
Using: data_files.zip
27 files inside; first 20:
   2015_banding.csv
   20151211_chapter3_nau_isotope_1.csv
   20160225_chapter3_nau_isotope_2.csv
   adc_level0_sora_vira_yera.xlsx
   bsc_gl_rails.csv
   bsc_pp_rails.csv
   bsc_usfws_combined_master.csv
   CAN_adm1.rds
   gsd_northamerica.asc
   jpe12723-sup-0001-AppendixS1.csv
   MEX_adm1.rds
   nau_round_2.csv
   sdmdata_sora.csv
   sdmdata_vira.csv
   sdmdata_yera.csv
   sk_banding.csv
   sk_rails_2014.csv
   sora_sdm_models.Rdata
   sora_top_model_predicted.asc
   stable_isotopes.csv
   species  prefix  bandnumber  region   area    impound  wingcord  tarsus  middletoe  culmen  tail  month  day  yera  age  sex
0     yera    1372       18501      nw   nvca  sanctuary      80.0    29.0       24.0     7.0  27.0      9   24  2015  NaN  NaN
1     sora    1372       18503      nw  scnwr        sgd     104.0    37.0       32.0    21.0  42.0     10   12  2015  NaN  NaN
2     sora    1372       18504      nw  scnwr        sgd     108.0    38.0       31.0    18.0  44.0     10   12  2015  NaN  NaN
3     sora    1372       18505      nw  scnwr        sgd     100.0    38.0       30.0    18.0  43.0     10   12  2015  NaN  NaN
4     sora    1372       18506      nw  scnwr        sgd     102.0    37.0       32.0    17.0  45.0     10   12  2015  NaN  NaN
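The cell above loads whichever CSV happens to come first. To target a specific entry and keep memory bounded, one option is to stream it in chunks and aggregate as you go. The sketch below counts the values of one column that way; `count_values_in_zip_csv` is a hypothetical helper, and the `species` column is taken from the preview of `2015_banding.csv` shown above.

```python
from collections import Counter
from pathlib import Path
import zipfile

import pandas as pd


def count_values_in_zip_csv(archive, member, column, chunksize=50_000):
    """Count the values of one column of a CSV stored inside a ZIP.

    The entry is streamed in chunks of `chunksize` rows and only `column`
    is parsed, so the full table is never held in memory at once.
    Missing values are ignored (the value_counts default).
    """
    counts = Counter()
    with zipfile.ZipFile(archive) as zf, zf.open(member) as fh:
        for chunk in pd.read_csv(fh, usecols=[column], chunksize=chunksize):
            counts.update(chunk[column].value_counts().to_dict())
    return counts


# Example usage; runs only where the dataset is actually mounted.
archive = Path('/srv/data/wetland-birds-migration/data_files.zip')
if archive.exists():
    print(count_values_in_zip_csv(archive, '2015_banding.csv', 'species'))
```

The same pattern works for other per-chunk summaries (sums, minima, filtered subsets), as long as each chunk's result can be combined with the running total.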

Your turn

This is just the starting point. Some ideas:

  • Check the dataset challenge on its CSDH data sheet.
  • Work on a copy: right-click the file → Duplicate (or Save Notebook As…). Your changes only live in your Hub space — they're never pushed to GitHub.
  • Edited this notebook and want the original back? Use the Restore cell below (or the restore.ipynb notebook).
  • Questions and results: post them on the platform forum.

Attribution: data from Migratory Connectivity in Wetland Birds (Citizen Science + Isotopes), license CC0-1.0. Notebook from the Citizen Science Data Hub (CSDH) — Fundación Ibercivis.

In [3]:
# ⚠️ RESTORE: this DISCARDS YOUR CHANGES to this notebook and resets it to the original.
# 1. Uncomment the line below (remove the #)   2. Run this cell
# 3. Then: menu File → Reload Notebook from Disk

# !git -C ~/citizen-science-data fetch -q origin && git -C ~/citizen-science-data checkout origin/main -- wetland-birds-migration.ipynb && echo "Restored. Now: File → Reload Notebook from Disk"