Species Occurrence Cube SQL Functions
These custom functions have been developed by the B-Cubed project for producing species occurrence cubes.
See Generate a species occurrence cube using Microsoft Azure Databricks for an example of their usage.
Grid functions
These functions are usually used for generating species occurrence cubes. They calculate a grid cell for a point.
EEA Reference Grid cell code
Suggested function declaration:
CREATE OR REPLACE TEMPORARY FUNCTION eeaCellCode AS 'org.gbif.occurrence.hive.udf.EeaCellCodeUDF';
Function definition:
STRING eeaCellCode(INTEGER gridSize,
DOUBLE latitude,
DOUBLE longitude,
DOUBLE coordinateUncertaintyInMeters)
Parameters:
gridSize
-
The grid size in metres of the EEA Reference Grid. Must be 25, 100, 250, 1000, 10000 or 100000.
latitude
-
The latitude of the point to grid (usually the
decimalLatitude
field) longitude
-
The longitude of the point to grid (usually the
decimalLongitude
field) coordinateUncertaintyInMeters
-
The uncertainty radius to apply to the point (usually used with a default value, for example
COALESCE(coordinateUncertaintyInMeters, 1000)
)
The coordinateUncertaintyInMeters value is used to randomize the point within the circle.
Return value:
The code for the cell of the EEA reference grid in which the randomized point falls.
Example arguments and returned values:
Result |
gridSize |
latitude |
longitude |
coordinateUncertaintyInMeters |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
Reference: About the EEA reference grid (PDF).
Extended Quarter-Degree Grid cell code
Suggested function declaration:
CREATE OR REPLACE TEMPORARY FUNCTION eqdgcCode AS 'org.gbif.occurrence.hive.udf.ExtendedQuarterDegreeGridCellCodeUDF';
Function definition:
STRING eqdgcCode(INTEGER level,
DOUBLE latitude,
DOUBLE longitude,
DOUBLE coordinateUncertaintyInMeters)
Parameters:
level
-
The level of the grid; the number of additional divisions applied to a one-degree cell.
latitude
-
The latitude of the point to grid (usually the
decimalLatitude
field) longitude
-
The longitude of the point to grid (usually the
decimalLongitude
field) coordinateUncertaintyInMeters
-
The uncertainty radius to apply to the point (usually used with a default value, for example
COALESCE(coordinateUncertaintyInMeters, 1000)
)
The coordinateUncertaintyInMeters value is used to randomize the point within the circle.
A level of 0 will give the 1° cell, e.g. S01E010
. For quarter-degrees, use level 2, e.g. S01E010AD
.
Return value:
The code for the cell of the Extended Quarter-Degree Grid in which the randomized point falls.
Example arguments and returned values:
Result |
level |
latitude |
longitude |
coordinateUncertaintyInMeters |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
See also: Wikipedia: QDGC.
ISEA3H Grid cell code
This function implements the Inverse Snyder Equal-Area Projection (ISEA) Aperture 3 Hexagonal (3H) Discrete Global Grid System (DGGS), ISEA3H. Grid cell codes/identifiers are as specified in A novel identifier scheme for the ISEA Aperture 3 Hexagon Discrete Global Grid System.
Suggested function declaration:
CREATE OR REPLACE TEMPORARY FUNCTION isea3hCode AS 'org.gbif.occurrence.hive.udf.Isea3hCellCodeUDF';
Function definition:
STRING isea3hCode(INTEGER resolution,
DOUBLE latitude,
DOUBLE longitude,
DOUBLE coordinateUncertaintyInMeters)
Parameters:
resolution
-
The resolution of the grid; the number of division steps applied to the initial dodecahedron. Valid values are 1 to 22.
latitude
-
The latitude of the point to grid (usually the
decimalLatitude
field) longitude
-
The longitude of the point to grid (usually the
decimalLongitude
field) coordinateUncertaintyInMeters
-
The uncertainty radius to apply to the point (usually used with a default value, for example
COALESCE(coordinateUncertaintyInMeters, 1000)
)
The coordinateUncertaintyInMeters value is used to randomize the point within the circle.
See ISEA3H details for the number and size of cells at each resolution.
Return value:
The code for the cell of the ISEA3H grid cell in which the randomized point falls.
Example arguments and returned values:
Result |
resolution |
latitude |
longitude |
coordinateUncertaintyInMeters |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
References:
See also:
Military Grid Reference System cell code
Suggested function declaration:
CREATE OR REPLACE TEMPORARY FUNCTION mgrsCode AS 'org.gbif.occurrence.hive.udf.MilitaryGridReferenceSystemCellCodeUDF';
Function definition:
STRING mgrsCode(INTEGER gridSize,
DOUBLE latitude,
DOUBLE longitude,
DOUBLE coordinateUncertaintyInMeters)
Parameters:
gridSize
-
The grid size in metres. Must be 1, 10, 100, 1000, 10000, 100000 or 0.
latitude
-
The latitude of the point to grid (usually the
decimalLatitude
field) longitude
-
The longitude of the point to grid (usually the
decimalLongitude
field) coordinateUncertaintyInMeters
-
The uncertainty radius to apply to the point (usually used with a default value, for example
COALESCE(coordinateUncertaintyInMeters, 1000)
)
The coordinateUncertaintyInMeters value is used to randomize the point within the circle.
A gridSize of 0 will give the Grid Zone Junction (GZJ) only, e.g. 32
. Other values increase the accuracy of the grid, e.g. 100 (metres) 32UNC686615
.
Return value:
The code for the cell of the Military Grid Reference System in which the randomized point falls.
Example arguments and returned values:
Result |
gridSize |
latitude |
longitude |
coordinateUncertaintyInMeters |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
Reference: Grids and Reference Systems.
See also: Wikipedia: Military Grid Reference System.
Temporal Uncertainty
Suggested function declaration:
CREATE OR REPLACE TEMPORARY FUNCTION temporalUncertainty AS 'org.gbif.occurrence.hive.udf.TemporalUncertaintyUDF';
Function definition:
STRING temporalUncertainty(String dateInterval)
Parameters:
dateInterval
-
A string containing an ISO-8106 date or date interval.
Return value:
This function returns the uncertainty in seconds of the date.
Example arguments and returned value
dateInterval |
Result |
Remarks |
2021-03-21T15:01:32.456Z |
1 |
Milliseconds are rounded to seconds. |
2021-03-21T15:01:32Z |
1 |
|
2021-03-21T15:01Z |
60 |
|
2021-03-21T15Z |
60×60 |
|
2021-03-21 |
60×60×24 |
|
2021-03-01 |
60×60×24 |
|
2021-01-01 |
60×60×24 |
|
2021-03 |
60×60×24×31 |
|
2021 |
60×60×24×365 |
|
2021-03-21/2021-03-23 |
60×60×24×3 |
Software repository
Source code for the functions is available on GitHub.
Compiled code is available from GBIF’s repository; choose the latest version and download cube-functions-<VERSION>.jar
.