Nuances of Spatial Data and Operations in Apache SOLR

SOLR is a scalable, distributed search and indexing platform. It supports indexing spatial data and provides fast search over it.
Compared with conventional spatial data solutions such as PostGIS, SOLR is geared towards high-speed search using bounding box and range queries, which are common in spatial data exploration and processing.
Indexing points and running bounding box queries are covered extensively in the Apache SOLR documentation.

Here we discuss some specific use cases that the documentation mentions but does not explain in detail.

1. Indexing Fences in SOLR: SOLR provides a convenient and powerful type, location_rpt, which is an instance of solr.SpatialRecursivePrefixTreeFieldType. Out of the box, location_rpt is typically used to index POINT data consisting of latitude and longitude. Points are sufficient for indexing locations that need to be monitored, and they can be searched using bbox or geofilt queries as described in the documentation. A typical use case in geospatial search, however, is drawing fences around areas that need to be monitored. This requires storing, indexing and querying two-dimensional areas. 2D shapes can be indexed using SpatialRecursivePrefixTreeFieldType (RPT for short) by defining a custom type in the schema as below. This requires the spatial4j JtsSpatialContextFactory, so the JTS jar must be added to the SOLR classpath for the factory to resolve. Once the type is added, data can be indexed into fields of type fence_rpt, with 2D shapes written in WKT as POLYGON((x1 y1, x2 y2, x3 y3, x4 y4, x1 y1)), where x is the longitude, y is the latitude, and the first point is repeated at the end to close the ring.

<fieldType name="fence_rpt" class="solr.SpatialRecursivePrefixTreeFieldType" spatialContextFactory="com.spatial4j.core.context.jts.JtsSpatialContextFactory" distErrPct="0.025" maxDistErr="0.000009" units="degrees" />
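As a minimal sketch of preparing a fence for indexing, the Python snippet below builds a WKT polygon string from a list of corner points and closes the ring if needed. The helper name `to_wkt_polygon`, the field name `geo`, the document id and the sample coordinates are all assumptions made for illustration; none of them come from the SOLR documentation.

```python
import json

def to_wkt_polygon(points):
    """Build a WKT POLYGON string from (x, y) pairs, x = longitude, y = latitude."""
    if len(points) < 3:
        raise ValueError("a polygon needs at least three corners")
    ring = list(points)
    # WKT requires a closed ring: the last point must repeat the first.
    if ring[0] != ring[-1]:
        ring.append(ring[0])
    coords = ", ".join(f"{x} {y}" for x, y in ring)
    return f"POLYGON(({coords}))"

# Hypothetical fence document; "geo" is assumed to be a field of type fence_rpt.
fence_doc = {
    "id": "fence-1",
    "geo": to_wkt_polygon([(10, 10), (20, 10), (20, 20), (10, 20)]),
}
print(json.dumps(fence_doc))
```

The resulting dictionary can be sent to SOLR through whichever update mechanism is already in use; only the WKT string in the spatial field is specific to fences.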

To monitor fences, we can determine which of the indexed fences a point intersects using fq=geo:"Intersects(POINT(x y))", where geo is the name of the field holding the fences. A common use case is tracking a person's movements and identifying which fences they enter and leave.
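The point lookup can be sketched as a small helper that assembles the request parameters. This is only an illustration: the function name `point_in_fences_params` is hypothetical, the field name `geo` is taken from the query in the text, and the parameters would still need to be sent to the select handler of your own collection.

```python
from urllib.parse import urlencode

def point_in_fences_params(x, y, field="geo"):
    """Return query parameters matching fences that intersect point (x, y)."""
    return {
        "q": "*:*",
        # The spatial predicate and its WKT shape are quoted as one field value.
        "fq": f'{field}:"Intersects(POINT({x} {y}))"',
    }

params = point_in_fences_params(15, 15)
print(params["fq"])
# URL-encoded form, ready to append to a select URL.
print(urlencode(params))
```

Calling the helper each time a new position arrives and comparing the returned fence ids with the previous result is one simple way to detect entries and exits.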

2. Indexing non-geo data in SOLR: By default, the location_rpt type indexes geographical data made up of latitude and longitude. To index, search and process non-geographic data, an RPT type can be defined with the geo="false" attribute. When indexing non-geographic points, world bounds need to be defined; they form the coordinate system for the indexed data.
Non-geographic RPT types can be combined with JTS to support indexing 2D shapes.
This is useful for indoor location indexing and search. It can also be used, though probably not as intended, to build custom solutions where shapes need to be indexed and searched for use cases that are not location based, such as image analysis.

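<!-- worldBounds is written as ENVELOPE(minX, maxX, maxY, minY); this covers x and y from 0 to 1000 -->
<fieldType name="nongeorpt"
class="solr.SpatialRecursivePrefixTreeFieldType"
geo="false"
worldBounds="ENVELOPE(0, 1000, 1000, 0)"
maxDistErr="0.001"
units="degrees" />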
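For an indoor use case, a rough sketch of turning a rectangular room into an indexable shape might look like the following. It assumes the world is the 0 to 1000 square that the field type above is meant to cover; the `WORLD` tuple, both function names and the room coordinates are hypothetical. Checking the bounds on the client side is a design choice here, so that shapes outside the coordinate system are caught before they are sent for indexing.

```python
# Assumed world: minX, minY, maxX, maxY of the custom coordinate system.
WORLD = (0, 0, 1000, 1000)

def in_world(x, y, world=WORLD):
    """True if (x, y) lies inside the configured world bounds."""
    min_x, min_y, max_x, max_y = world
    return min_x <= x <= max_x and min_y <= y <= max_y

def rect_to_wkt(min_x, min_y, max_x, max_y, world=WORLD):
    """Convert an axis-aligned rectangle into a closed WKT polygon."""
    corners = [(min_x, min_y), (max_x, min_y), (max_x, max_y), (min_x, max_y)]
    for x, y in corners:
        if not in_world(x, y, world):
            raise ValueError(f"corner ({x}, {y}) is outside the world bounds")
    ring = corners + [corners[0]]  # repeat the first corner to close the ring
    return "POLYGON((" + ", ".join(f"{x} {y}" for x, y in ring) + "))"

# A hypothetical meeting room on a floor plan measured in arbitrary units.
print(rect_to_wkt(100, 200, 300, 400))
```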

P.S. Food for thought: Motion-controlled games use point-in-polygon tests to determine the movement of a controller and render that movement on screen. Motion controller applications could leverage indexing of points and polygons in a custom reference frame for varied applications, such as tracking hand movement to drive robotic arms.
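For readers unfamiliar with the point-in-polygon test mentioned above, here is a minimal ray casting sketch. It is a generic textbook technique shown for illustration, not how SOLR evaluates spatial queries internally, and the function name and sample square are hypothetical.

```python
def point_in_polygon(x, y, polygon):
    """Ray casting: count edges crossed by a horizontal ray going right from (x, y)."""
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Only edges that straddle the ray's y level can be crossed.
        if (y1 > y) != (y2 > y):
            # x coordinate at which the edge crosses that y level.
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside  # each crossing flips inside/outside
    return inside

square = [(0, 0), (10, 0), (10, 10), (0, 10)]
print(point_in_polygon(5, 5, square))
print(point_in_polygon(15, 5, square))
```

An odd number of crossings means the point is inside the polygon, an even number means it is outside; points exactly on an edge are not treated specially in this sketch.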
