mirror of
https://github.com/K-Dense-AI/claude-scientific-skills.git
synced 2026-03-27 07:09:27 +08:00
Major improvements to the GeoMaster geospatial science skill:

### New Features
- Added Rust geospatial support (GeoRust crates: geo, proj, shapefile, rstar)
- Added comprehensive coordinate systems reference documentation
- Added troubleshooting guide with common error fixes
- Added cloud-native workflows (STAC, Planetary Computer, COG)
- Added automatic skill activation configuration

### Reference Documentation
- NEW: references/coordinate-systems.md - CRS fundamentals, UTM zones, EPSG codes
- NEW: references/troubleshooting.md - Installation fixes, runtime errors, performance tips
- UPDATED: references/programming-languages.md - Now covers 8 languages (added Rust)

### Main Skill File
- Streamlined SKILL.md from 690 to 362 lines (500-line rule compliance)
- Enhanced installation instructions with uv and conda
- Added modern cloud-native workflow examples
- Added performance optimization tips

### Documentation
- NEW: GEOMASTER_IMPROVEMENTS.md - Complete changelog and testing guide
- UPDATED: README.md - Highlight new capabilities

### Skill Activation
- Created skill-rules.json with 150+ keywords and 50+ intent patterns
- Supports file-based and content-based automatic activation

The skill now covers 8 programming languages (Python, R, Julia, JavaScript, C++, Java, Go, Rust) with 500+ code examples across 70+ geospatial topics.
366 lines
12 KiB
Markdown
---
name: geomaster
description: Comprehensive geospatial science skill covering remote sensing, GIS, spatial analysis, machine learning for earth observation, and 30+ scientific domains. Supports satellite imagery processing (Sentinel, Landsat, MODIS, SAR, hyperspectral), vector and raster data operations, spatial statistics, point cloud processing, network analysis, cloud-native workflows (STAC, COG, Planetary Computer), and 8 programming languages (Python, R, Julia, JavaScript, C++, Java, Go, Rust) with 500+ code examples. Use for remote sensing workflows, GIS analysis, spatial ML, Earth observation data processing, terrain analysis, hydrological modeling, marine spatial analysis, atmospheric science, and any geospatial computation task.
license: MIT License
metadata:
    skill-author: K-Dense Inc.
---

# GeoMaster

Comprehensive geospatial science skill covering GIS, remote sensing, spatial analysis, and ML for Earth observation across 70+ topics with 500+ code examples in 8 programming languages.

## Installation

```bash
# Core Python stack (conda recommended)
conda install -c conda-forge gdal rasterio fiona shapely pyproj geopandas

# Remote sensing & ML
uv pip install rsgislib torchgeo earthengine-api
uv pip install scikit-learn xgboost torch-geometric

# Network & visualization
uv pip install osmnx networkx folium keplergl
uv pip install cartopy contextily mapclassify

# Big data & cloud (odc-stac and rio-cogeo are used in the cloud-native examples)
uv pip install xarray rioxarray dask-geopandas
uv pip install pystac-client planetary-computer odc-stac rio-cogeo

# Point clouds (pylas has been merged into laspy 2.x)
uv pip install laspy open3d pdal

# Databases
conda install -c conda-forge postgis spatialite
```

## Quick Start

### NDVI from Sentinel-2

```python
import rasterio
import numpy as np

with rasterio.open('sentinel2.tif') as src:
    # Band indices assume the file stores bands in Sentinel-2 order
    red = src.read(4).astype(float)  # B04
    nir = src.read(8).astype(float)  # B08
    ndvi = (nir - red) / (nir + red + 1e-8)  # small epsilon avoids division by zero
    ndvi = np.nan_to_num(ndvi, nan=0)

    # Copy the profile while the dataset is still open
    profile = src.profile
    profile.update(count=1, dtype=rasterio.float32)

with rasterio.open('ndvi.tif', 'w', **profile) as dst:
    dst.write(ndvi.astype(rasterio.float32), 1)
```

### Spatial Analysis with GeoPandas

```python
import geopandas as gpd

# Load and ensure same CRS
zones = gpd.read_file('zones.geojson')
points = gpd.read_file('points.geojson')

if zones.crs != points.crs:
    points = points.to_crs(zones.crs)

# Spatial join and statistics
joined = gpd.sjoin(points, zones, how='inner', predicate='within')
stats = joined.groupby('zone_id').agg({
    'value': ['count', 'mean', 'std', 'min', 'max']
}).round(2)
```

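To show what the `within` join and the grouped statistics compute, here is a minimal pure-Python sketch. It assumes each zone is a simple polygon given as a list of vertices, and it uses a ray-casting test in place of the spatial index GeoPandas relies on; `point_in_polygon` and `zonal_point_stats` are illustrative names, not GeoPandas functions.

```python
from collections import defaultdict
from statistics import mean

def point_in_polygon(x, y, ring):
    """Ray-casting test: count crossings of a ray running east from (x, y)."""
    inside = False
    n = len(ring)
    for i in range(n):
        x1, y1 = ring[i]
        x2, y2 = ring[(i + 1) % n]
        # Only edges that straddle the horizontal line through y can be crossed
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside

def zonal_point_stats(points, zones):
    """Group point values by containing zone, like an inner 'within' join."""
    values = defaultdict(list)
    for x, y, value in points:
        for zone_id, ring in zones.items():
            if point_in_polygon(x, y, ring):
                values[zone_id].append(value)
    return {
        zone_id: {"count": len(v), "mean": mean(v), "min": min(v), "max": max(v)}
        for zone_id, v in values.items()
    }

# Hypothetical data: two adjacent square zones and four (x, y, value) points
zones = {
    "A": [(0, 0), (10, 0), (10, 10), (0, 10)],
    "B": [(10, 0), (20, 0), (20, 10), (10, 10)],
}
points = [(2, 3, 5.0), (8, 8, 7.0), (15, 5, 1.0), (25, 5, 9.0)]
stats = zonal_point_stats(points, zones)
print(stats)
```

The point at `(25, 5)` falls in no zone and is dropped, which mirrors `how='inner'`. On real data, prefer `gpd.sjoin`, which handles holes, multipolygons, and indexing.
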
### Google Earth Engine Time Series

```python
import ee
import pandas as pd

ee.Initialize(project='your-project')
roi = ee.Geometry.Point([-122.4, 37.7]).buffer(10000)

s2 = (ee.ImageCollection('COPERNICUS/S2_SR_HARMONIZED')
      .filterBounds(roi)
      .filterDate('2020-01-01', '2023-12-31')
      .filter(ee.Filter.lt('CLOUDY_PIXEL_PERCENTAGE', 20)))

def add_ndvi(img):
    return img.addBands(img.normalizedDifference(['B8', 'B4']).rename('NDVI'))

s2_ndvi = s2.map(add_ndvi)

def extract_series(image):
    stats = image.reduceRegion(ee.Reducer.mean(), roi.centroid(), scale=10, maxPixels=1e9)
    return ee.Feature(None, {'date': image.date().format('YYYY-MM-dd'), 'ndvi': stats.get('NDVI')})

series = s2_ndvi.map(extract_series).getInfo()
df = pd.DataFrame([f['properties'] for f in series['features']])
df['date'] = pd.to_datetime(df['date'])
```

## Core Concepts

### Data Types

| Type | Examples | Libraries |
|------|----------|-----------|
| Vector | Shapefile, GeoJSON, GeoPackage | GeoPandas, Fiona, GDAL |
| Raster | GeoTIFF, NetCDF, COG | Rasterio, Xarray, GDAL |
| Point Cloud | LAS, LAZ | Laspy, PDAL, Open3D |

### Coordinate Systems

- **EPSG:4326** (WGS 84) - Geographic, lat/lon, use for storage
- **EPSG:3857** (Web Mercator) - Web maps only (don't use for area/distance!)
- **EPSG:326xx/327xx** (UTM) - Metric calculations, <1% distortion per zone
- Use `gdf.estimate_utm_crs()` for automatic UTM detection

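To make the Web Mercator warning concrete: on a spherical Mercator projection the local linear scale factor is `1 / cos(latitude)`, so areas are inflated by its square. The sketch below assumes the spherical approximation, and the function names are illustrative rather than taken from any library.

```python
import math

def web_mercator_scale(lat_deg):
    """Linear scale factor of spherical Mercator at a given latitude."""
    return 1.0 / math.cos(math.radians(lat_deg))

def web_mercator_area_inflation(lat_deg):
    """Factor by which projected areas exceed true areas at a given latitude."""
    return web_mercator_scale(lat_deg) ** 2

for lat in (0.0, 37.7, 60.0):
    print(f"lat {lat:5.1f}: length x{web_mercator_scale(lat):.2f}, "
          f"area x{web_mercator_area_inflation(lat):.2f}")
```

At 60 degrees latitude, lengths measured in EPSG:3857 come out about twice too long and areas about four times too large, which is why metric work belongs in a projected CRS such as UTM.
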
```python
# Always check CRS before operations
assert gdf1.crs == gdf2.crs, "CRS mismatch!"

# For area/distance calculations, use projected CRS
gdf_metric = gdf.to_crs(gdf.estimate_utm_crs())
area_sqm = gdf_metric.geometry.area
```

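The `326xx`/`327xx` pattern can be derived by hand: the zone number comes from the longitude in 6-degree steps, and the prefix from the hemisphere. The following is a minimal sketch of that rule; `utm_epsg` is a hypothetical helper, and it ignores the special zone boundaries around Norway and Svalbard, so treat it as an illustration rather than a replacement for `estimate_utm_crs()`.

```python
def utm_epsg(lon, lat):
    """EPSG code of the standard WGS 84 / UTM zone containing (lon, lat)."""
    if not (-180 <= lon <= 180 and -80 <= lat <= 84):
        raise ValueError("coordinates outside the UTM domain")
    # Zones are 6 degrees wide, numbered 1-60 starting at 180 W
    zone = min(int((lon + 180) // 6) + 1, 60)
    # 326xx = northern hemisphere, 327xx = southern hemisphere
    return (32600 if lat >= 0 else 32700) + zone

print(utm_epsg(-122.4, 37.7))   # San Francisco
print(utm_epsg(151.2, -33.9))   # Sydney
```
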
### OGC Standards

- **WMS**: Web Map Service - rendered map images
- **WFS**: Web Feature Service - vector features
- **WCS**: Web Coverage Service - raw raster coverages
- **STAC**: SpatioTemporal Asset Catalog - modern metadata for searching imagery

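As an illustration of how these services are queried, a WMS 1.3.0 `GetMap` request is just a URL with a fixed set of key-value parameters. The sketch below only builds the URL with the standard library; the endpoint and layer name are placeholders, not a real service.

```python
from urllib.parse import urlencode, urlsplit, parse_qs

def wms_getmap_url(base_url, layer, bbox, width, height,
                   crs="EPSG:3857", fmt="image/png"):
    """Build a WMS 1.3.0 GetMap URL for a single layer."""
    params = {
        "SERVICE": "WMS",
        "VERSION": "1.3.0",
        "REQUEST": "GetMap",
        "LAYERS": layer,
        "STYLES": "",  # empty string requests the default style
        "CRS": crs,
        "BBOX": ",".join(str(v) for v in bbox),
        "WIDTH": width,
        "HEIGHT": height,
        "FORMAT": fmt,
    }
    return f"{base_url}?{urlencode(params)}"

# Placeholder endpoint and layer, bounding box in EPSG:3857 metres
url = wms_getmap_url("https://example.com/wms", "landcover",
                     (-13630000, 4540000, -13620000, 4550000), 512, 512)
print(url)
```

One detail to watch: in WMS 1.3.0 the `BBOX` axis order follows the CRS definition, so with `EPSG:4326` the values are given as latitude first, then longitude.
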
## Common Operations

### Spectral Indices

```python
import rasterio

def calculate_indices(image_path):
    """NDVI, EVI, SAVI, NDWI from Sentinel-2."""
    with rasterio.open(image_path) as src:
        # Assumes a 5-band stack ordered B02, B03, B04, B08, B11 holding
        # reflectance scaled to 0-1 (the EVI and SAVI constants depend on it)
        B02, B03, B04, B08, B11 = [src.read(i).astype(float) for i in [1,2,3,4,5]]

    ndvi = (B08 - B04) / (B08 + B04 + 1e-8)
    evi = 2.5 * (B08 - B04) / (B08 + 6*B04 - 7.5*B02 + 1)
    savi = ((B08 - B04) / (B08 + B04 + 0.5)) * 1.5  # soil factor L = 0.5
    ndwi = (B03 - B08) / (B03 + B08 + 1e-8)

    return {'NDVI': ndvi, 'EVI': evi, 'SAVI': savi, 'NDWI': ndwi}
```

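A quick way to sanity-check these formulas is to evaluate them for a single pixel. The sketch below uses made-up reflectance values for dense vegetation and assumes they are already scaled to 0-1; `spectral_indices` is an illustrative helper, not part of any library.

```python
import math

def spectral_indices(blue, green, red, nir, soil_factor=0.5):
    """Evaluate NDVI, EVI, SAVI and NDWI for one pixel of 0-1 reflectance."""
    ndvi = (nir - red) / (nir + red)
    evi = 2.5 * (nir - red) / (nir + 6 * red - 7.5 * blue + 1)
    savi = (nir - red) / (nir + red + soil_factor) * (1 + soil_factor)
    ndwi = (green - nir) / (green + nir)
    return {"NDVI": ndvi, "EVI": evi, "SAVI": savi, "NDWI": ndwi}

# Hypothetical vegetated pixel: low red, high near-infrared reflectance
pixel = spectral_indices(blue=0.03, green=0.08, red=0.05, nir=0.45)
for name, value in pixel.items():
    print(f"{name}: {value:.3f}")
```

For this pixel NDVI is about 0.8 and NDWI is clearly negative, as expected for vegetation rather than water. If the inputs were raw integer digital numbers instead of 0-1 reflectance, the `+ 1` in EVI and the `0.5` in SAVI would be negligible and those two indices would be wrong.
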
### Vector Operations

```python
import geopandas as gpd

# Buffer (use projected CRS!)
gdf_proj = gdf.to_crs(gdf.estimate_utm_crs())
# Keep the buffers with the projected frame so geometry and CRS stay consistent
gdf_proj['buffer_1km'] = gdf_proj.geometry.buffer(1000)

# Spatial relationships
intersects = gdf[gdf.geometry.intersects(other_geometry)]
contains = gdf[gdf.geometry.contains(point_geometry)]

# Geometric operations
gdf_proj['centroid'] = gdf_proj.geometry.centroid  # centroids are unreliable in a geographic CRS
gdf['simplified'] = gdf.geometry.simplify(tolerance=0.001)  # tolerance is in CRS units

# Overlay operations
intersection = gpd.overlay(gdf1, gdf2, how='intersection')
union = gpd.overlay(gdf1, gdf2, how='union')
```

### Terrain Analysis

```python
import numpy as np
import rasterio

def terrain_metrics(dem_path):
    """Calculate slope, aspect, hillshade from a north-up DEM in a projected CRS."""
    with rasterio.open(dem_path) as src:
        dem = src.read(1).astype(float)
        xres, yres = src.res  # cell size in CRS units

    # Rows run north to south, so dy is the elevation change per unit southward
    dy, dx = np.gradient(dem, yres, xres)
    slope = np.degrees(np.arctan(np.hypot(dx, dy)))
    # Downslope direction as a compass bearing (0 = north, 90 = east)
    aspect = np.degrees(np.arctan2(-dx, dy)) % 360

    # Hillshade (sun azimuth 315 degrees, altitude 45 degrees)
    az_rad, alt_rad = np.radians(315), np.radians(45)
    slope_rad, aspect_rad = np.radians(slope), np.radians(aspect)
    hillshade = (np.sin(alt_rad) * np.cos(slope_rad) +
                 np.cos(alt_rad) * np.sin(slope_rad) *
                 np.cos(az_rad - aspect_rad))

    return slope, aspect, hillshade
```

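The hillshade term is easiest to check on single cells. A commonly used illumination model is `sin(alt) * cos(slope) + cos(alt) * sin(slope) * cos(az - aspect)`, with aspect taken as the downslope compass bearing. The sketch below evaluates it with the standard library only; the function name and the clamping of shadowed cells to zero are choices made for this example.

```python
import math

def hillshade(slope_deg, aspect_deg, azimuth_deg=315.0, altitude_deg=45.0):
    """Relative illumination (0-1) of a cell with the given slope and aspect."""
    slope, aspect = math.radians(slope_deg), math.radians(aspect_deg)
    az, alt = math.radians(azimuth_deg), math.radians(altitude_deg)
    value = (math.sin(alt) * math.cos(slope)
             + math.cos(alt) * math.sin(slope) * math.cos(az - aspect))
    return max(0.0, value)  # cells facing away from the sun are fully shaded

flat = hillshade(0, 0)            # flat ground: brightness depends only on sun altitude
facing_sun = hillshade(45, 315)   # 45-degree slope facing the sun
facing_away = hillshade(45, 135)  # same slope facing directly away
print(flat, facing_sun, facing_away)
```

Flat ground evaluates to `sin(45 degrees)`, about 0.71, regardless of aspect; a slope facing the sun is brighter and one facing away is darker. Multiply by 255 for the usual 8-bit rendering.
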
### Network Analysis

```python
import osmnx as ox
import networkx as nx

# Download and analyze street network
G = ox.graph_from_place('San Francisco, CA', network_type='drive')
# Each helper returns the graph, so call them one after the other
G = ox.add_edge_speeds(G)
G = ox.add_edge_travel_times(G)

# Shortest path (nearest_nodes takes X = longitude, Y = latitude)
orig = ox.distance.nearest_nodes(G, -122.4, 37.7)
dest = ox.distance.nearest_nodes(G, -122.3, 37.8)
route = nx.shortest_path(G, orig, dest, weight='travel_time')
```

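To see why `weight='travel_time'` matters, the sketch below runs Dijkstra's algorithm on a tiny hand-made directed graph where the longer route is faster. It is a simplified stand-in for what NetworkX does internally; the edge list, node names, and `fastest_route` are invented for the example.

```python
import heapq

def fastest_route(edges, origin, destination):
    """Dijkstra over directed edges given as (u, v, length_m, speed_kph)."""
    graph = {}
    for u, v, length_m, speed_kph in edges:
        travel_time = length_m / (speed_kph * 1000 / 3600)  # seconds
        graph.setdefault(u, []).append((v, travel_time))

    best = {origin: 0.0}
    previous = {}
    queue = [(0.0, origin)]
    while queue:
        cost, node = heapq.heappop(queue)
        if node == destination:
            break
        if cost > best.get(node, float("inf")):
            continue  # stale queue entry
        for neighbor, weight in graph.get(node, []):
            new_cost = cost + weight
            if new_cost < best.get(neighbor, float("inf")):
                best[neighbor] = new_cost
                previous[neighbor] = node
                heapq.heappush(queue, (new_cost, neighbor))

    if destination not in best:
        raise ValueError("no route between the given nodes")
    path = [destination]
    while path[-1] != origin:
        path.append(previous[path[-1]])
    return path[::-1], best[destination]

# Two ways from A to D: a short slow street or a longer fast road
edges = [
    ("A", "B", 1000, 30), ("B", "D", 1000, 30),   # 2 km at 30 km/h
    ("A", "C", 2000, 100), ("C", "D", 2000, 100), # 4 km at 100 km/h
]
route, seconds = fastest_route(edges, "A", "D")
print(route, round(seconds))
```

Weighted by length, the route through `B` would win; weighted by travel time, the route through `C` does.
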
## Image Classification

```python
import numpy as np
import rasterio
from rasterio.features import rasterize
from sklearn.ensemble import RandomForestClassifier

def classify_imagery(raster_path, training_gdf, output_path):
    """Train RF and classify imagery."""
    with rasterio.open(raster_path) as src:
        image = src.read()
        profile = src.profile
        transform = src.transform

    # Extract training data: burn each polygon into a mask and collect its pixels
    X_train, y_train = [], []
    for _, row in training_gdf.iterrows():
        mask = rasterize([(row.geometry, 1)],
                         out_shape=(profile['height'], profile['width']),
                         transform=transform, fill=0, dtype=np.uint8)
        pixels = image[:, mask > 0].T
        X_train.extend(pixels)
        y_train.extend([row['class_id']] * len(pixels))

    # Train and predict
    rf = RandomForestClassifier(n_estimators=100, max_depth=20, n_jobs=-1)
    rf.fit(X_train, y_train)

    # Flatten to (pixels, bands) for prediction, then restore the raster shape
    prediction = rf.predict(image.reshape(image.shape[0], -1).T)
    prediction = prediction.reshape(profile['height'], profile['width'])

    profile.update(dtype=rasterio.uint8, count=1)
    with rasterio.open(output_path, 'w', **profile) as dst:
        dst.write(prediction.astype(rasterio.uint8), 1)

    return rf
```

## Modern Cloud-Native Workflows

### STAC + Planetary Computer

```python
import pystac_client
import planetary_computer
import odc.stac

# Search Sentinel-2 via STAC
catalog = pystac_client.Client.open(
    "https://planetarycomputer.microsoft.com/api/stac/v1",
    modifier=planetary_computer.sign_inplace,
)

search = catalog.search(
    collections=["sentinel-2-l2a"],
    bbox=[-122.5, 37.7, -122.3, 37.9],
    datetime="2023-01-01/2023-12-31",
    query={"eo:cloud_cover": {"lt": 20}},
)

# Load as xarray (cloud-native!)
data = odc.stac.load(
    list(search.items())[:5],
    bands=["B02", "B03", "B04", "B08"],
    crs="EPSG:32610",
    resolution=10,
)

# Calculate NDVI on xarray (cast first: the bands load as unsigned integers,
# so the subtraction would otherwise wrap around)
red = data.B04.astype("float32")
nir = data.B08.astype("float32")
ndvi = (nir - red) / (nir + red)
```

### Cloud-Optimized GeoTIFF (COG)

```python
import rasterio
from rasterio.session import AWSSession
from rio_cogeo.cogeo import cog_validate

# Read COG directly from cloud (partial reads)
session = AWSSession(aws_access_key_id=..., aws_secret_access_key=...)
with rasterio.Env(session=session):
    with rasterio.open('s3://bucket/path.tif') as src:
        # Read only window of interest: ((row_start, row_stop), (col_start, col_stop))
        window = ((1000, 2000), (1000, 2000))
        subset = src.read(1, window=window)

# Write a tiled, compressed GeoTIFF; `profile` and `data` come from an earlier read.
# A fully compliant COG also needs internal overviews.
profile.update(driver='GTiff', tiled=True, blockxsize=256, blockysize=256,
               compress='DEFLATE', predictor=2)
with rasterio.open('output.tif', 'w', **profile) as dst:
    dst.write(data)

# Validate COG
is_valid, errors, warnings = cog_validate('output.tif')
```

## Performance Tips

```python
import rasterio
import rioxarray
from osgeo import gdal
from sklearn.ensemble import RandomForestClassifier

# 1. Spatial indexing (10-100x faster queries)
gdf.sindex  # Auto-created by GeoPandas on first access

# 2. Chunk large rasters
with rasterio.open('large.tif') as src:
    for _, window in src.block_windows(1):
        block = src.read(1, window=window)

# 3. Dask for big data (rioxarray returns a lazy, Dask-backed DataArray)
dask_raster = rioxarray.open_rasterio('large.tif', chunks={'band': 1, 'x': 1024, 'y': 1024})

# 4. Use Arrow for I/O (requires the pyogrio engine)
gdf.to_file('output.gpkg', engine='pyogrio', use_arrow=True)

# 5. GDAL caching
gdal.SetCacheMax(2**30)  # 1GB cache

# 6. Parallel processing
rf = RandomForestClassifier(n_jobs=-1)  # All cores
```

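When a file's internal blocks are too small or too irregular to iterate over directly, a fixed tiling grid is a common alternative for chunked processing. The generator below is a minimal sketch of that idea; `tile_windows` is a hypothetical helper, and each tuple could be passed to `rasterio.windows.Window(col_off, row_off, width, height)`.

```python
def tile_windows(height, width, tile=1024):
    """Yield (row_off, col_off, n_rows, n_cols) tiles covering a raster."""
    for row_off in range(0, height, tile):
        for col_off in range(0, width, tile):
            # Edge tiles are clipped to the raster extent
            yield (row_off, col_off,
                   min(tile, height - row_off),
                   min(tile, width - col_off))

windows = list(tile_windows(2500, 3000, tile=1024))
print(len(windows), windows[-1])
```

Every pixel is covered exactly once, so per-tile results can be written back independently or handed to parallel workers.
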
## Best Practices

1. **Always check CRS** before spatial operations
2. **Use projected CRS** for area/distance calculations
3. **Validate geometries**: `gdf = gdf[gdf.is_valid]`
4. **Handle missing data**: `gdf['geometry'] = gdf['geometry'].fillna(None)`
5. **Use efficient formats**: GeoPackage > Shapefile, Parquet for large data
6. **Apply cloud masking** to optical imagery
7. **Preserve lineage** for reproducible research
8. **Use appropriate resolution** for your analysis scale

## Detailed Documentation

- **[Coordinate Systems](references/coordinate-systems.md)** - CRS fundamentals, UTM, transformations
- **[Core Libraries](references/core-libraries.md)** - GDAL, Rasterio, GeoPandas, Shapely
- **[Remote Sensing](references/remote-sensing.md)** - Satellite missions, spectral indices, SAR
- **[Machine Learning](references/machine-learning.md)** - Deep learning, CNNs, GNNs for RS
- **[GIS Software](references/gis-software.md)** - QGIS, ArcGIS, GRASS integration
- **[Scientific Domains](references/scientific-domains.md)** - Marine, hydrology, agriculture, forestry
- **[Advanced GIS](references/advanced-gis.md)** - 3D GIS, spatiotemporal, topology
- **[Big Data](references/big-data.md)** - Distributed processing, GPU acceleration
- **[Industry Applications](references/industry-applications.md)** - Urban planning, disaster management
- **[Programming Languages](references/programming-languages.md)** - Python, R, Julia, JS, C++, Java, Go, Rust
- **[Data Sources](references/data-sources.md)** - Satellite catalogs, APIs
- **[Troubleshooting](references/troubleshooting.md)** - Common issues, debugging, error reference
- **[Code Examples](references/code-examples.md)** - 500+ examples

---

**GeoMaster covers everything from basic GIS operations to advanced remote sensing and machine learning.**