Usually, before performing raster-to-vector conversion, I will binarize the raster.
use GDAL
/**
* TIFF Filter
*
* @param sourceTiff Source TIFF file
* @param targetTiff Target TIFF file
* @param min Minimum value
* @param max Maximum value
* @param nBand Band number
*/
public static void tiffFilter(Path sourceTiff, Path targetTiff, double min, double max, int nBand) {
if (Files.notExists(sourceTiff)) {
throw new ServiceException("File does not exist!");
}
Dataset inputDataset = gdal.Open(sourceTiff.toAbsolutePath().toString(), gdalconst.GA_ReadOnly);
if (inputDataset == null) {
throw new ServiceException("Unable to open raster file: " + gdal.GetLastErrorMsg());
}
// Get basic information about the raster data
int xSize = inputDataset.GetRasterXSize();
int ySize = inputDataset.GetRasterYSize();
// Read the raster data
float[] data = new float[xSize * ySize];
try {
Band band = inputDataset.GetRasterBand(nBand);
band.ReadRaster(0, 0, xSize, ySize, data);
// Apply filter operations, such as setting pixel values below a threshold to 0
for (int i = 0; i < data.length; i++) {
if (data[i] > max || data[i] < min) {
data[i] = 0;
continue;
}
// Set pixel values greater than the threshold to 1
data[i] = 1;
}
} catch (Exception e) {
inputDataset.delete();
throw e;
}
Dataset outputDataset = null;
try {
// Create a new raster dataset
Driver driver = gdal.GetDriverByName(inputDataset.GetDriver().getShortName());
if (driver == null) {
throw new ServiceException("Unable to get GTiff driver: " + gdal.GetLastErrorMsg());
}
outputDataset = driver.Create(targetTiff.toAbsolutePath().toString(), xSize, ySize, 1, gdalconst.GDT_Float64);
if (outputDataset == null) {
throw new ServiceException("Unable to create output raster file: " + gdal.GetLastErrorMsg());
}
// Write the filtered data
Band outputBand = outputDataset.GetRasterBand(nBand);
outputBand.WriteRaster(0, 0, xSize, ySize, data);
outputBand.SetNoDataValue(0);
// Set the georeference information and projection
outputDataset.SetGeoTransform(inputDataset.GetGeoTransform());
outputDataset.SetProjection(inputDataset.GetProjection());
} finally {
// Cleanup
inputDataset.delete();
if (Objects.nonNull(outputDataset)) {
outputDataset.delete();
}
}
if (Files.notExists(targetTiff)) {
throw new ServiceException("Raster filtering failed!");
}
}
GeoTools
In this method, I used JDK 8's parallel stream processing. Although dividing by boundaries saves memory, it is still very slow. If there are better solutions, we can discuss and exchange ideas.
/**
* Converts the grid coverage to a SimpleFeatureCollection.
* This method divides the given grid coverage into multiple small grids, extracts the geometry for each small grid,
* and then merges these geometries into a SimpleFeatureCollection.
*
* @param gridCoverage2D The grid coverage that provides the geographic range and data to be processed.
* @param nodata The value used to fill no-data regions.
* @param handle A functional interface used for further processing of the merged geometries.
* @param row The number of rows to divide.
* @param cols The number of columns to divide.
* @return Returns a SimpleFeatureCollection containing the processed geometries.
*/
public static SimpleFeatureCollection subGrid2SFC(GridCoverage2D gridCoverage2D, double nodata, boolean insideEdges,
int row, int cols, Function<Geometry, Geometry> handle) {
// Divide the grid coverage into smaller regions
List<Envelope2D> envelope2DS = GeoToolsUtils.divideEnvelope(gridCoverage2D.getEnvelope2D(), row, cols);
PolygonExtractionProcess process = new PolygonExtractionProcess();
List<Geometry> rets = new CopyOnWriteArrayList<>();
AtomicInteger atomicInteger = new AtomicInteger();
// Parallel processing of each small grid
envelope2DS.parallelStream()
.forEach(d -> {
// Log the processing progress
if (LOG.isDebugEnabled()) {
int i = atomicInteger.incrementAndGet();
LOG.debug("Current progress of raster to vector conversion: {} out of {}", i, envelope2DS.size());
}
// Convert the boundaries of the small grid into polygons
Polygon polygon = GeoToolsUtils.createPolygon(d);
// Extract data from the grid coverage using the polygon and create a SimpleFeatureCollection
SimpleFeatureCollection ret = process.execute(gridCoverage2D, 0, insideEdges, polygon,
Collections.singleton(nodata), null, null);
if (ret.isEmpty()) {
return;
}
// Convert the SimpleFeatureCollection into a list of geometries
List<Geometry> geometries = GeoToolsUtils.fcToGeometryList(ret);
// Merge the list of geometries into a single geometry
Geometry union = GeoToolsUtils.union(geometries);
if (Objects.isNull(union) || union.isEmpty()) {
return;
}
// Apply additional geometry processing logic
Geometry apply = handle.apply(union);
rets.add(apply);
});
CoordinateReferenceSystem crs = gridCoverage2D.getCoordinateReferenceSystem();
return GeoToolsUtils.geometries2FeatureCollection(rets, "subGrid2SFC", crs);
}
GDAL
Using GDAL is faster and more memory-efficient.
#!/bin/bash
# Check the number of parameters
if [ "$#" -ne 3 ]; then
echo "Usage: $0 <input_tiff> <band> <output_shapefile>"
exit 1
fi
# Parameter 1: Path to the input TIFF file
INPUT_TIFF=$1
BAND=$2
# Parameter 2: Path to the output shapefile
OUTPUT_SHP=$3
# Execute the gdal_polygonize.py command
gdal_polygonize.py "$INPUT_TIFF" -b "$BAND" -f 'ESRI Shapefile' "$OUTPUT_SHP" 1 fid
/**
* 栅格转矢量
*
* @param gridPath 网格路径
* @param targetShapePath 目标形状路径
* @param band 波段
*/
public static void grid2Vector(Path gridPath, Path targetShapePath, int band) {
String gdalCMD = your_shell_path + "/gdal_polygonize.sh %s %s %s";
String cmd = String.format(gdalCMD, gridPath.toAbsolutePath(), band, targetShapePath.toAbsolutePath());
LOG.debug("gdal_polygonize cmd:{}", cmd);
String rests = RuntimeUtil.execForStr("/bin/bash " + cmd);
LOG.debug("gdal 栅格转矢量输出 {}", rests);
if (Files.notExists(targetShapePath)) {
throw new ServiceException("gdal 栅格转矢量失败!");
}
}
Q: How can i only get the black part polygon from the black white tiff?
A: The white areas can be set to 0, the black areas to 1, and the remaining areas as NoData values. When exporting to vector, GDAL will include the corresponding attributes.