API SQL Downloads – Functions
This is an experimental feature, and the implementation may change throughout 2024. The feature is currently only available for preview by invited users. Contact helpdesk@gbif.org to request access. |
These custom functions are available when using the SQL download API.
Note the function names are not case-sensitive, gbif_geodistance
, GBIF_GEODISTANCE
and GBIF_GeoDistance
are all equivalent.
Standard functions
Many of the basic functions supported by Apache Hive are supported.
See Hive Operators and User-Defined Functions in the Apache Hive manual.
Filtering functions
These are the same functions used for predicate-based downloads.
Geospatial distance, GBIF_GeoDistance
Function definition:
BOOLEAN GBIF_GeoDistance(DOUBLE centroidLatitude,
DOUBLE centroidLongitude,
STRING distance,
DOUBLE latitude,
DOUBLE longitude)
Parameters:
centroidLatitude
-
The latitude of the centroid, in WGS84 decimal degrees
centroidLongitude
-
The longitude of the centroid, in WGS84 decimal degrees
distance
-
A distance in metres or kilometres including the unit, e.g.
500m
or2.5km
latitude
-
The latitude of the point to filter (usually the
decimalLatitude
field) longitude
-
The longitude of the point to filter (usually the
decimalLongitude
field)
Return value:
This function returns TRUE
if the point is within distance of the centroid. It is equivalent to the geoDistance
predicate.
Within polygon, GBIF_Within
Function definition:
BOOLEAN GBIF_Within(STRING geometryWellKnownText,
DOUBLE latitude,
DOUBLE longitude)
Parameters:
geometryWellKnownText
-
A polygon specified in Well-known text (WKT) format
latitude
-
The latitude of the point to filter (usually the
decimalLatitude
field) longitude
-
The longitude of the point to filter (usually the
decimalLongitude
field)
Return value:
This function returns TRUE
if the point is within the polygon. It is equivalent to the within
predicate, and to the geometry
parameter of the occurrence search API.
Notes:
If you have problems, first check your polygon with Wicket — ensure the points are ordered anti-clockwise. (A polygon with clockwise points represents the opposite: the whole world except the polygon.)
Grid functions
These functions are usually used for generating species occurrence cubes. They calculate a grid cell for a point.
EEA Reference Grid cell code, GBIF_EEARGCode
Function definition:
STRING GBIF_EEARGCode(INTEGER gridSize,
DOUBLE latitude,
DOUBLE longitude,
DOUBLE coordinateUncertaintyInMeters)
Parameters:
gridSize
-
The grid size in metres of the EEA Reference Grid. Must be 25, 100, 250, 1000, 10000 or 100000.
latitude
-
The latitude of the point to grid (usually the
decimalLatitude
field) longitude
-
The longitude of the point to grid (usually the
decimalLongitude
field) coordinateUncertaintyInMeters
-
The uncertainty radius to apply to the point (usually used with a default value, for example
COALESCE(coordinateUncertaintyInMeters, 1000)
)
The coordinateUncertaintyInMeters value is used to randomize the point within the circle.
Return value:
The code for the cell of the EEA reference grid in which the randomized point falls.
Example arguments and returned values:
Result |
gridSize |
latitude |
longitude |
coordinateUncertaintyInMeters |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
Reference: About the EEA reference grid (PDF).
Extended Quarter-Degree Grid cell code, GBIF_EQDGCCode
Function definition:
STRING GBIF_EQDGCCode(INTEGER level,
DOUBLE latitude,
DOUBLE longitude,
DOUBLE coordinateUncertaintyInMeters)
Parameters:
level
-
The level of the grid; the number of additional divisions applied to a one-degree cell.
latitude
-
The latitude of the point to grid (usually the
decimalLatitude
field) longitude
-
The longitude of the point to grid (usually the
decimalLongitude
field) coordinateUncertaintyInMeters
-
The uncertainty radius to apply to the point (usually used with a default value, for example
COALESCE(coordinateUncertaintyInMeters, 1000)
)
The coordinateUncertaintyInMeters value is used to randomize the point within the circle.
A level of 0 will give the 1° cell, e.g. S01E010
. For quarter-degrees, use level 2, e.g. S01E010AD
.
Return value:
The code for the cell of the Extended Quarter-Degree Grid in which the randomized point falls.
Example arguments and returned values:
Result |
level |
latitude |
longitude |
coordinateUncertaintyInMeters |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
See also: Wikipedia: QDGC.
ISEA3H Grid cell code GBIF_ISEA3HCode
This function implements the Inverse Snyder Equal-Area Projection (ISEA) Aperture 3 Hexagonal (3H) Discrete Global Grid System (DGGS), ISEA3H. Grid cell codes/identifiers are as specified in A novel identifier scheme for the ISEA Aperture 3 Hexagon Discrete Global Grid System.
Function definition:
STRING GBIF_ISEA3HCode(INTEGER resolution,
DOUBLE latitude,
DOUBLE longitude,
DOUBLE coordinateUncertaintyInMeters)
Parameters:
resolution
-
The resolution of the grid; the number of division steps applied to the initial dodecahedron. Valid values are 1 to 22.
latitude
-
The latitude of the point to grid (usually the
decimalLatitude
field) longitude
-
The longitude of the point to grid (usually the
decimalLongitude
field) coordinateUncertaintyInMeters
-
The uncertainty radius to apply to the point (usually used with a default value, for example
COALESCE(coordinateUncertaintyInMeters, 1000)
)
The coordinateUncertaintyInMeters value is used to randomize the point within the circle.
See ISEA3H details for the number and size of cells at each resolution.
Return value:
The code for the cell of the ISEA3H grid cell in which the randomized point falls.
Example arguments and returned values:
Result |
resolution |
latitude |
longitude |
coordinateUncertaintyInMeters |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
References:
See also:
Military Grid Reference System cell code, GBIF_MGRSCode
Function definition:
STRING GBIF_MGRSCode(INTEGER gridSize,
DOUBLE latitude,
DOUBLE longitude,
DOUBLE coordinateUncertaintyInMeters)
Parameters:
gridSize
-
The grid size in metres. Must be 1, 10, 100, 1000, 10000, 100000 or 0.
latitude
-
The latitude of the point to grid (usually the
decimalLatitude
field) longitude
-
The longitude of the point to grid (usually the
decimalLongitude
field) coordinateUncertaintyInMeters
-
The uncertainty radius to apply to the point (usually used with a default value, for example
COALESCE(coordinateUncertaintyInMeters, 1000)
)
The coordinateUncertaintyInMeters value is used to randomize the point within the circle.
A gridSize of 0 will give the Grid Zone Junction (GZJ) only, e.g. 32
. Other values increase the accuracy of the grid, e.g. 100 (metres) 32UNC686615
.
Return value:
The code for the cell of the Military Grid Reference System in which the randomized point falls.
Example arguments and returned values:
Result |
gridSize |
latitude |
longitude |
coordinateUncertaintyInMeters |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
Reference: Grids and Reference Systems.
See also: Wikipedia: Military Grid Reference System.
Text output functions
These functions are useful when producing text-type output, e.g. tab-delimited files.
Join Array, GBIF_JoinArray
Function definition:
STRING GBIF_JoinArray(ARRAY array,
STRING separator)
Parameters:
array
-
An array
separator
-
A separator to put between the array values, e.g.
|
or `, `
Return value:
This function returns the array elements separated by the separator.
Temporal Uncertainty, GBIF_TemporalUncertainty
Function definition:
STRING GBIF_TemporalUncertainty(String dateInterval)
Parameters:
dateInterval
-
A string containing an ISO-8106 date or date interval.
Return value:
This function returns the uncertainty in seconds of the date.
Example arguments and returned value
dateInterval |
Result |
Remarks |
2021-03-21T15:01:32.456Z |
1 |
Milliseconds are rounded to seconds. |
2021-03-21T15:01:32Z |
1 |
|
2021-03-21T15:01Z |
60 |
|
2021-03-21T15Z |
60×60 |
|
2021-03-21 |
60×60×24 |
|
2021-03-01 |
60×60×24 |
|
2021-01-01 |
60×60×24 |
|
2021-03 |
60×60×24×31 |
|
2021 |
60×60×24×365 |
|
2021-03-21/2021-03-23 |
60×60×24×3 |