
# Introduction to Geospatial Raster and Vector Data with Python 18-19th Sep
2024-09-18
Welcome to The Workshop Collaborative Document.
This Document is synchronized as you type, so that everyone viewing this page sees the same text. This allows you to collaborate seamlessly on documents.
----------------------------------------------------------------------------
This is the Document for today: https://notes.crib.utwente.nl/2024-09-18-Geospatial-Python-Workshop-Day1
Collaborative Document day 2: https://notes.crib.utwente.nl/2024-09-18-Geospatial-Python-Workshop-Day2
## Code of Conduct
Participants are expected to follow these guidelines:
* Use welcoming and inclusive language.
* Be respectful of different viewpoints and experiences.
* Gracefully accept constructive criticism.
* Focus on what is best for the community.
* Show courtesy and respect towards other community members.
## License
All content is publicly available under the Creative Commons Attribution License: [creativecommons.org/licenses/by/4.0/](https://creativecommons.org/licenses/by/4.0/).
## Getting help
Contact nearby Helper or put on RED sticker note on your laptop if you need assistance.
## Workshop website
[link](https://itc-crib.github.io/2024-09-18-Geospatial-Python-Workshop/)
## Setup
### Python & Environment:
[link](https://carpentries-incubator.github.io/geospatial-python/#software-setup)
📚 Download files:
- [brpgewaspercelen_definitief_2020_small.gpkg](https://figshare.com/ndownloader/files/37729413)
- [brogmwvolledigeset.zip](https://figshare.com/ndownloader/files/37729416)
- [status_vaarweg.zip](https://figshare.com/ndownloader/files/37729419)
## Instructors
Robert Ohuru, Indupriya Mydur, Adhitya Bhawiyuga
## Helpers
Jay Gohil, Jay Pandya, Pitchaporn (Japan)
## Attendance
Muhammad Jawad ITC-PhD(m.jawad@utwente.)
Abdul Khaleq Al-Qasaily (GeoAI/ Spatial Data Science)
Aamir Imran ITC-PhD
Akshay Anand /PhD-PGM-ITC
Meenal Sharma /ITC
Vidya N. Fikriyah/ITC
Fabian Moßner /M-SE
Nick Muntendam / BSc Applied Mathematics
Jennie Powers / MSc Spatial Engineering
Nanette Kingma / AES Department
Rizki Hanintyo / PhD WRS ITC
Debraj Deka / M-GEO, ITC
Ella Keijser / MSc Spatial Engineering
Mateo Moreno/ AES Department
Yudhi Tanasa / ITC-NRS
Sharvil Upadhye / M-GEO
Danush Raghu / M-GEO
Rakibun Athid/M-SE
Rodolfo Rani / AES
Amina Mohammedshum/ITC-WRS
Annisa Puspa Kirana/polinema
## Agenda
**Day 1**
| Time | Topic |
| ----- | --------------------------------------- |
| 09:30 | Welcome and icebreaker, setup check |
| 09:45 | Introduction to raster, vector, and CRS |
| 10:00 | Access satellite imagery using Python |
| 11:00 | *Coffee break* |
| 11:15 | Read and visualize raster data |
| 12:30 | *Lunch* |
| 13:30 | Vector data in Python |
| 14:30 | *Coffee break* |
| 14:45 | Raster with rioxarray and geopandas |
| 15:45 | *Short break* |
| 16:00 | Raster calculations in Python |
| 16:45 | Wrap-up |
| 17:00 | END of Day 1 |
**Day 2**
| Time | Topic |
| ----- | --------------------------------------- |
| 09:15 | Welcome and icebreaker |
| 09:30 | Calculating Zonal Statistics on Rasters |
| 10:45 | *Coffee break* |
| 11:00 | Parallel raster computations using Dask |
| 12:15 | Wrap-up |
| 12:30 | END |
## Exercises
Search for all the available Sentinel-2 scenes in the sentinel-2-l2a collection that satisfy the following criteria:
- intersect a provided bounding box (use ±0.01 deg in lat/lon from the previously defined point);
- have been recorded between 20 March 2020 and 30 March 2020;
- have a cloud coverage smaller than 10% (hint: use the query input argument of client.search).
How many scenes are available? Save the search results in GeoJSON format.
## Collaborative Notes
Is the environment setup working on your computer?
```sh
conda activate -n geospatial
cd data-folder-path (geospatial-python)
jupyter lab
rename notebook to -> "intro-stac.ipynb"
```
## Introduction to STAC
```python=
# Import required packages
from osgeo import gdal
import rioxarray
from pystac_client import Client
from shapely.geometry import Point
```
```python=
api_url= "https://earth-search.aws.element84.com/v1"
```
```python=
client = Client.open(api_url)
```
```python=
collection = "sentinel-2-l2a"
```
```python=
point = Point(4.89, 52.37)
```
```python=
search = client.search(
collections=[collection],
intersects=point,
max_items=10
)
```
```python=
print(search.matched())
```
```python=
items = search.item_collection()
```
```python=
print(len(items))
```
```python=
for item in items:
print(item)
```
```python=
item = items[0]
print(item.datetime)
```
```python=
print(item.geometry)
```
```python=
print(item.properties)
```
```python=
bbox = point.buffer(0.01).bounds
```
```python=
search = client.search(
collections=[collection],
bbox=bbox,
datetime='2020-03-20/2020-03-30',
query=['eo:cloud_cover<10']
)
```
```python=
print(search.matched())
```
```python=
items = search.item_collection()
```
```python=
items.save_object("search.json")
```
```python=
assets = items[1].assets
```
```python=
print(assets)
```
```python=
print(assets.keys())
```
```python=
print(assets["thumbnail"].href)
```
```python=
nir_href = assets["nir"].href
nir_href
```
```python=
nir = rioxarray.open_rasterio(nir_href)
```
```python=
print(nir)
```
```python=
nir.rio.to_raster("nir.tif")
```
```python=
nir[0:1500, 1500:2200].rio.to_raster("nir_subset.tif")
```
```python=
cm_api_url = "https://cmr.earthdata.nasa.gov/stac/LPCLOUD"
```
```python=
collection = "HLSL30.v2.0"
```
```python=
client = Client.open(cm_api_url)
```
```python=
search = client.search(
collections=[collection],
intersects=Point(-73.97, 40.78),
datetime="2021-02-01/2021-03-30"
)
```
```python=
items = search.item_collection()
print(len(items))
```
```python=
import pystac
```
```python=
items = pystac.ItemCollection.from_file("search.json")
```
```python=
print(len(items))
```
```python=
raster_ams_b9 = rioxarray.open_rasterio(items[0].assets["nir09"].href)
```
```python=
print(raster_ams_b9)
```
```python=
print(raster_ams_b9.rio.nodata)
```
```python=
print(raster_ams_b9.rio.crs)
```
### Visualize raster
```python=
raster_ams_b9.values
```
```python=
raster_ams_b9.plot()
```
```python=
raster_ams_b9.plot(robust=True)
```
```python=
raster_ams_b9.plot(vmin=100, vmax=7000)
```
```python=
print(raster_ams_b9.rio.crs)
```
```python=
print(raster_ams_b9.rio.crs.to_epsg())
```
```python=
from pyproj import crs
```
```python=
epsg_code = raster_ams_b9.rio.crs.to_epsg()
```
```python=
crs = CRS(epsg_code)
```
```python=
crs
```
```python=
print(raster_ams_b9.mean())
```
```python=
raster_ams_overview = rioxarray.open_rasterio(
items[0].assets['visual'].href,
overview_level=3
)
```
```python=
raster_ams_overview
```
```python=
raster_ams_overview.plot.imshow()
```
# Geopandas Part
```python=
import geopandas as gpd
```
```python=
fields = gpd.read_file("data/brpgewaspercelen_definitief_2020_small.gpkg")
fields
```
```python=
# define bounding box
xmin, xmax = (110_000, 140_000)
ymin, ymax = (470_000, 510_000)
bbox = (xmin, ymin, xmax, ymax)
```
```python=
# partially load data within the bounding box
fields = gpd.read_file("data/brpgewaspercelen_definitief_2020_small.gpkg", bbox=bbox)
```
```python=
fields.plot()
```
```python=
fields.type
```
```python=
fields.crs
```
```python=
fields.total_bounds
```
```python=
xmin, xmax = (120_000, 135_000)
ymin, ymax = (485_000, 500_000)
fields_cx = fields.cx[xmin:xmax, ymin:ymax]
```
```python=
fields_cx.to_file("fields_cropped.shp")
```
```python=
fields = gpd.read_file("fields_cropped.shp")
```
```python=
wells = gpd.read_file("data/brogmwvolledigeset.zip")
```
```python=
wells.plot(markersize=0.1)
```
```python=
wells = wells.to_crs(epsg=28992)
```
```python=
wells_clip = wells.clip(fields)
wells_clip
```
```python=
buffer = fields.buffer(50)
fields_buffer = fields.copy()
fields_buffer['geometry'] = buffer
fields_buffer.plot()
```
```python=
fields_buffer_dissolve = fields_buffer.dissolve()
fields_buffer_dissolve
```
```python=
wells_clip_buffer = wells.clip(fields_buffer_dissolve)
wells_clip_buffer.plot()
```
```python=
fields = gpd.read_file("fields_cropped.shp")
wells = gpd.read_file("data/brogmwvolledigeset.zip")
```
```python=
xmin, ymin, xmax, ymax =fields.total_bounds
wells = wells.to_crs(28992)
wells_cx = wells.cx[xmin-500:xmax+500,ymin-500:ymax+500]
```
```python=
wells_cx_500mbuffer = wells_cx.copy()
wells_cx_500mbuffer['geometry']=wells_cx.buffer(500)
```
```python=
fields_clip_buffer = fields.clip(wells_cx_500mbuffer)
fields_clip_buffer.plot()
```
```python=
fields_wells_buffer = fields.sjoin(wells_cx_500mbuffer)
```
```python=
print(fields_wells_buffer.shape)
```
```python=
idx = fields_wells_buffer.index.unique()
fields_in_buffer = fields.iloc[idx]
fields_in_buffer.plot()
```
```python=
waterways_nl = gpd.read_file("data/status_vaarweg.zip")
waterways_nl.plot()
```
```python=
waterways_nl['geometry']
```
```python=
print(waterways_nl['geometry'][2])
```
```python=
print(type(waterways_nl['geometry'][2]))
```
```python=
import shapely
```
```python=
def flip(geometry):
return shapely.ops.transform(lambda x,y: (y,x), geometry)
geom_corrected = waterways_nl['geometry'].apply(flip)
```
```python=
#update geometry
waterways_nl['geometry'] = geom_corrected
waterways_nl.plot()
```
```python=
import pystac
import rioxarray
```
```python=
items = pystac.ItemCollection.from_file("search.json")
```
```python=
raster = rioxarray.open_rasterio(items[1].assets["visual"].href)
print(raster.shape)
```
```python=
raster_overview = rioxarray.open_rasterio(items[1].assets["visual"].href, overview_level=3)
print(raster_overview.shape)
```
```python=
raster_overview.plot.imshow(figsize=(8,8))
```
```python=
from pyproj import CRS
CRS(raster.rio.crs)
```
```python=
fields.crs
```
```python=
fields = fields.to_crs(raster.rio.crs)
```
```python=
#Crop the raster
raster_clip_box = raster.rio.clip_box(*fields.total_bounds)
print(raster_clip_box.shape)
```
```python=
raster_clip_box.plot.imshow(figsize=(8,8))
```
```python=
raster_clip_box.rio.to_raster("raster_clip.tif")
```
```python=
raster_clip_fields = raster_clip_box.rio.clip(fields['geometry'])
```
```python=
raster_clip_fields.plot.imshow(figsize=(8,8))
```
# Exercise: crop raster data with a specific code
#### In the column “gewascode” (translated as “crop code”) of fields, you can find the code representing the types of plants grown in each field. Can you:
##### 1. Select the fields with “gewascode” equal to 257;
##### 2. Crop the raster raster_clip_box with the selected fields;
##### 3. Visualize the cropped image.
```python=
mask = fields['gewascode']==257
fields_gwascode = fields.where(mask)
fields_gwascode = fields_gwascode.dropna()
raster_clip_fields_gwascode = raster_clip_box.rio.clip(fields_gwascode['geometry'])
raster_clip_fields_gwascode.plot.imshow(figsize=(8,8))
```
```python=
# Reproject to RD to make the CRS different from the "raster"
raster_clip_fields_gwascode = raster_clip_fields_gwascode.rio.reproject("EPSG:28992")
CRS(raster_clip_fields_gwascode.rio.crs)
```
```python=
CRS(raster_clip_box.rio.crs)
```
```python=
raster_reproject_match = raster_clip_box.rio.reproject_match(raster_clip_fields_gwascode)
raster_reproject_match.plot.imshow(figsize=(8,8))
```
```python=
raster_reproject_match = raster_clip_fields_gwascode.rio.reproject_match(raster_clip_box)
raster_reproject_match.plot.imshow(figsize=(8,8))
```
```python=
import pystac
```
```python=
items = pystac.ItemCollection.from_file("search.json")
```
```python=
red_uri = items[1].assets["red"].href
nir_uri = items[1].assets["nir08"].href
```
```python=
import rioxarray
```
```python=
red = rioxarray.open_rasterio(red_uri, masked=True)
nir = rioxarray.open_rasterio(nir_uri, masked=True)
```
```python=
bbox = (629_000, 5_804_000, 639_000, 5_814_000)
red_clip = red.rio.clip_box(*bbox)
nir_clip = nir.rio.clip_box(*bbox)
```
```python=
red_clip.plot(robust=True)
```
```python=
print(red_clip.shape, nir_clip.shape)
```
```python=
red_clip_matched = red_clip.rio.reproject_match(nir_clip)
print(red_clip_matched.shape)
```
```python=
ndvi = (nir_clip - red_clip_matched)/(nir_clip + red_clip_matched)
print(ndvi)
```
```python=
ndvi.plot()
```
```python=
ndvi.plot.hist()
```
```python=
print(ndvi.min().values)
print(ndvi.max().values)
```
```python=
ndvi.plot.hist(bins=50)
```
```python=
class_bins = (-1, 0., 0.2, 0.7, 1)
ndvi.plot(levels=class_bins)
```
```python=
ndvi_nonan = ndvi.interpolate_na(dim='x')
ndvi_nonan.rio.to_raster("ndvi.tif")
```
```python=
import numpy as np
import xarray
ndvi_classified = xarray.apply_ufunc(np.digitize, ndvi_nonan, class_bins)
```
```python=
import earthpy.plot as ep
import matplotlib.pyplot as plt
from matplotlib.colors import ListedColormap
ndvi_colors = ["blue", "gray", "green", "darkgreen"]
ndvi_cmaps = ListedColormap(ndvi_colors)
category_names = ["Water", "No Vegetation", "Sparse Vegetation", "Dense Vegetation"]
# we need to know in what order the legend items should be arranged
category_indices = list(range(len(category_names)))
im = ndvi_classified.plot(cmap=ndvi_cmaps, add_colorbar=False)
plt.title("Classified NDVI")
ep.draw_legend(im_ax=im, classes=category_indices, titles= category_names)
plt.savefig("NDVI_classified.png",bbox_inches="tight", dpi = 300)
```
```python=
ndvi_classified.rio.to_raster("NDVI_classified.tif",dtype="int32")
```
```python=
``