Non-random Temporal Patterns of Citizen Science Biodiversity Recording
Category: Biodiversity · Size: 3.0 MB · Format: ZIP · License: CC-BY-4.0 · Zenodo record · Data sheet on the CSDH
Data and code to analyse when citizens record biodiversity, identifying non-random temporal patterns in sampling effort and their associated drivers.
The data is mounted read-only at /srv/data/temporal-patterns-biodiversity/.
Save anything you produce in your personal folder (~/).
What's in the dataset
from pathlib import Path

DATA = Path('/srv/data/temporal-patterns-biodiversity')

# List every file under the dataset folder with its size in MB
for f in sorted(DATA.rglob('*')):
    if f.is_file():
        print(f"{f.relative_to(DATA)} ({f.stat().st_size/1e6:,.1f} MB)")
data&code.zip (3.0 MB)
Explore the ZIP
The dataset comes compressed. We list its contents without extracting; if it contains CSVs, pandas can read them straight from inside the ZIP. Remember: /srv/data is read-only — if you need to extract, do it into your folder (~/).
import zipfile
import pandas as pd

# Open the first ZIP found under the dataset folder
zips = sorted(DATA.rglob('*.zip'))
z = zipfile.ZipFile(zips[0])
print('Using:', zips[0].name)

# List the archive members without extracting them
names = z.namelist()
print(f'{len(names)} files inside; first 20:')
for n in names[:20]:
    print(' ', n)

# Preview the first CSV straight from inside the ZIP (at most 100,000 rows)
csv_inside = [n for n in names if n.lower().endswith('.csv')]
if csv_inside:
    df = pd.read_csv(z.open(csv_inside[0]), nrows=100_000, low_memory=False)
    display(df.head())
Using: data&code.zip
7 files inside; first 20:
  data&code/
  data&code/All_TAXA_10_replicates_with_holidays.csv
  data&code/descriptor.txt
  data&code/scp1.R
  data&code/scp2.R
  data&code/scp3.R
  data&code/scp4.R
| | ID | ID_subset | latitude | longitude | Species | cellLong | cellLat | date | day | month | ... | Dependent | Weekday | Month | Temperature_K | Temperature | Precipitation | Wind | Snow | Country | Holiday |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 6455115 | 6 | 37.040009 | -7.973232 | Pinus pinea | -7.997792 | 37.07940 | 42736 | 1 | 1 | ... | 0 | Sunday | Jan | 284.230652 | 11.080652 | 0.02 | 3.166559 | 0.0 | Portugal | 1 |
| 1 | 8753915 | 9 | 37.038830 | -7.781120 | Pinus pinea | -7.797847 | 37.07940 | 42736 | 1 | 1 | ... | 0 | Sunday | Jan | 284.325470 | 11.175470 | 0.02 | 3.077367 | 0.0 | Portugal | 1 |
| 2 | 27823919 | 8 | 39.935980 | -8.188933 | Quercus suber | -8.197736 | 39.97779 | 42737 | 2 | 1 | ... | 0 | Monday | Jan | 280.926483 | 7.776483 | 9.36 | 6.392357 | 0.0 | Portugal | 0 |
| 3 | 6022549 | 1 | 38.170922 | -7.034370 | Quercus rotundifolia | -6.998069 | 38.17879 | 42737 | 2 | 1 | ... | 0 | Monday | Jan | 284.136963 | 10.986963 | 0.08 | 4.979019 | 0.0 | Portugal | 0 |
| 4 | 20103566 | 4 | 40.023598 | -7.232988 | Quercus rotundifolia | -7.198014 | 39.97779 | 42737 | 2 | 1 | ... | 0 | Monday | Jan | 280.862793 | 7.712793 | 4.40 | 4.119776 | 0.0 | Portugal | 0 |
5 rows × 21 columns
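If reading in place is not enough (for example, to run the R scripts shipped in the archive), the ZIP can be unpacked into your home folder, since /srv/data is read-only. The sketch below is one way to do that, not part of the original notebook: the helper name `extract_to` and the target folder name in the commented call are hypothetical. The runnable part builds a throwaway archive so the example works anywhere.

```python
import tempfile
import zipfile
from pathlib import Path


def extract_to(zip_path: Path, dest: Path) -> list[Path]:
    """Extract every member of zip_path under dest; return the extracted file paths."""
    dest.mkdir(parents=True, exist_ok=True)
    with zipfile.ZipFile(zip_path) as zf:
        zf.extractall(dest)
        # Skip directory entries (their names end with '/')
        return [dest / name for name in zf.namelist() if not name.endswith('/')]


# On the Hub (source path from this notebook; the target folder name is an assumption):
# files = extract_to(Path('/srv/data/temporal-patterns-biodiversity/data&code.zip'),
#                    Path.home() / 'temporal-patterns-biodiversity')

# Self-contained demonstration with a throwaway archive
with tempfile.TemporaryDirectory() as tmp:
    tmp = Path(tmp)
    demo_zip = tmp / 'demo.zip'
    with zipfile.ZipFile(demo_zip, 'w') as zf:
        zf.writestr('demo/descriptor.txt', 'placeholder text')
    extracted = extract_to(demo_zip, tmp / 'out')
    demo_names = [p.relative_to(tmp / 'out').as_posix() for p in extracted]
    print(demo_names)
```

Extracting creates a second copy of the data in your space, so for the CSV alone the in-ZIP read shown earlier is usually enough.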
Your turn
This is just the starting point. Some ideas:
- Check the dataset challenge on its CSDH data sheet.
- Work on a copy: right-click the file → Duplicate (or Save Notebook As…). Your changes only live in your Hub space — they're never pushed to GitHub.
- Edited this notebook and want the original back? Use the Restore cell below (or the restore.ipynb notebook).
- Questions and results: on the platform forum.
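As one possible first step, the `date` column in the preview looks like an Excel-style serial day number (for example 42736 next to day 1, month 1, Sunday). That reading is an inference from the sample rows, not something the source states, so the sketch below converts the values under that assumption and cross-checks the result against the `Weekday` column before counting records per weekday. The sample values are copied from the preview table; the new column names `date_parsed` and `weekday_matches` are illustrative.

```python
import pandas as pd

# Sample values copied from the preview table (date and Weekday columns)
sample = pd.DataFrame({
    'date': [42736, 42736, 42737, 42737, 42737],
    'Weekday': ['Sunday', 'Sunday', 'Monday', 'Monday', 'Monday'],
})

# ASSUMPTION: `date` counts days since 1899-12-30 (Excel-style serial date)
sample['date_parsed'] = pd.to_datetime(
    sample['date'], unit='D', origin=pd.Timestamp('1899-12-30')
)

# Cross-check the assumption against the Weekday column shipped with the data
sample['weekday_matches'] = sample['date_parsed'].dt.day_name() == sample['Weekday']

# Records per weekday: a first look at when observations were made
per_weekday = sample['Weekday'].value_counts()

print(sample)
print(per_weekday)
```

On the full table, the same steps apply to `df` in place of `sample`. Before interpreting any counts, it is worth reading descriptor.txt in the archive for the meaning of columns such as `Dependent` and `ID_subset`.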
Attribution: data from Non-random Temporal Patterns of Citizen Science Biodiversity Recording, license CC-BY-4.0. Notebook from the Citizen Science Data Hub (CSDH) — Fundación Ibercivis.
# ⚠️ RESTORE: this DISCARDS YOUR CHANGES to this notebook and resets it to the original.
# 1. Uncomment the last line of this cell (remove the leading #)
# 2. Run this cell
# 3. Then: menu File → Reload Notebook from Disk
# !git -C ~/citizen-science-data fetch -q origin && git -C ~/citizen-science-data checkout origin/main -- temporal-patterns-biodiversity.ipynb && echo "Restored. Now: File → Reload Notebook from Disk"