What I ended up doing is, when I open the first image, then for each polygon,
- Use the image transform to convert the polygon to pixel coordinates, and find the rectangular bounding box containing the polygon in integer pixel coordinates.
- To figure out which pixels in the bounding box are actually in the polygon, construct shapely Polygons for each pixel and use the
.intersects()
method (if you wanted to only include pixels that are completely inside the polygon, you could use.contains()
). (I wasn’t sure if this would be slow, but it turned out not to be.) - Save the list of coordinate pairs for all pixels in each polygon.
Then for every new image you open, you just read the entire image data and index out the parts for each polygon because you already have the pixel indices.
Code looks approximately like this:
import math
import numpy
import pyproj
import rasterio.mask
from shapely.geometry import Polygon
shape_pixels = None
for filename in image_files:
with rasterio.open(filename) as src:
if shape_pixels is None:
projector = pyproj.Proj(src.crs)
pixelcoord_shapes = [
Polygon(zip(*(~src.transform * numpy.array(projector(*zip(*shape.boundary.coords))))))
for shape in shapes
]
pixels_per_shape = []
for shape in shapes:
xmin = max(0, math.floor(shape.bounds[0]))
ymin = max(0, math.floor(shape.bounds[1]))
xmax = math.ceil(shape.bounds[2])
ymax = math.ceil(shape.bounds[3])
pixel_squares = {}
for j in range(xmin, xmax+1):
for i in range(ymin, ymax+1):
pixel_squares[(i, j)] = Polygon.from_bounds(j, i, j+1, i+1)
pixels_per_shape.append([
coords for (coords, pixel) in pixel_squares.items()
if shape.intersects(pixel)
])
whole_data = src.read()
for pixels in pixels_per_shape:
ivals, jvals = zip(*pixels)
shape_data = whole_data[0, ivals, jvals]
...
CLICK HERE to find out more related problems solutions.