IBM Cloud Docs
SQL functions

SQL functions

IBM Cloud® Data Engine is deprecated. As of 18 February 2024 you can't create new instances, and access to free instances will be removed. Existing Standard plan instances are supported until 18 January 2025. Any instances that still exist on that date will be deleted.

You can use any of the following functions in your query.

!

expr
Logical not.

%

expr1 % expr2

Returns the remainder after expr1/expr2.

Example of an SQL function usage fragment

> SELECT 2 % 1.8

Result value

0.2

Example of an SQL function usage fragment

> SELECT MOD(2, 1.8)

Result value

0.2

&

expr1 & expr2

Returns the result of bitwise AND of expr1 and expr2.

Example of an SQL function usage fragment

> SELECT 3 & 5

Result value

1

*

expr1 * expr2

Returns expr1*expr2.

Example of an SQL function usage fragment

> SELECT 2 * 3

Result value

6

+

expr1 + expr2

Returns expr1+expr2.

Example of an SQL function usage fragment

> SELECT 1 + 2

Result value

3

-

expr1 - expr2

Returns expr1-expr2.

Example of an SQL function usage fragment

> SELECT 2 - 1

Result value

1

/

expr1 / expr2

Returns expr1/expr2. It always performs floating point division.

Example of an SQL function usage fragment

> SELECT 3 / 2```

Result value

1.5

Example of an SQL function usage fragment

> SELECT 2L / 2L

Result value

1.0

<

expr1 < expr2

Returns true if expr1 is less than expr2.

Arguments

expr1, expr2 - The two expressions must be same type or can be cast to a common type, and must be a type that can be ordered. For example, map type is not orderable, so it is not supported. For complex types such array and struct, the data types of fields must be orderable.

Example of an SQL function usage fragment

> SELECT 1 < 2

Result value

true

Example of an SQL function usage fragment

> SELECT 1.1 < '1'

Result value

false

Example of an SQL function usage fragment

> SELECT to_date('2009-07-30 04:17:52') < to_date('2009-07-30 04:17:52')

Result value

false

Example of an SQL function usage fragment

> SELECT to_date('2009-07-30 04:17:52') < to_date('2009-08-01 04:17:52')

Result value

true

Example of an SQL function usage fragment

> SELECT 1 < NULL

Result value

NULL

<=

expr1 >= expr2

Returns true if expr1 is less than or equal to expr2.

Arguments

expr1, expr2 - The two expressions must be same type or can be cast to a common type, and must be a type that can be ordered. For example, map type is not orderable, so it is not supported. For complex types such array and struct, the data types of fields must be orderable.

Example of an SQL function usage fragment

> SELECT 2 <= 2

Result value

true

Example of an SQL function usage fragment

> SELECT 1.0 <= '1'

Result value

true

Example of an SQL function usage fragment

> SELECT to_date('2009-07-30 04:17:52') <= to_date('2009-07-30 04:17:52')

Result value

true

Example of an SQL function usage fragment

> SELECT to_date('2009-07-30 04:17:52') <= to_date('2009-08-01 04:17:52')

Result value

true

Example of an SQL function usage fragment

> SELECT 1 <= NULL

Result value

NULL

<=>

expr1 <=> expr2

Returns same result as the EQUAL(=) operator for nonnull operands, but returns true if both are null, false if one of the them is null.

Arguments

expr1, expr2 - The two expressions must be same type or can be cast to a common type, and must be a type that can be used in equality comparison. Map type is not supported. For complex types such array and struct, the data types of fields must be orderable.

Example of an SQL function usage fragment

> SELECT 2 <=>; 2

Result value

true

Example of an SQL function usage fragment

> SELECT 1 <=> '1'

Result value

true

Example of an SQL function usage fragment

> SELECT true <=> NULL

Result value

false

Example of an SQL function usage fragment

> SELECT NULL <=> NULL

Result value

true

=

expr1 = expr2

Returns true if expr1 equals expr2, or false otherwise.

Arguments

expr1, expr2 - The two expressions must be same type or can be cast to a common type, and must be a type that can be used in equality comparison. Map type is not supported. For complex types such array and struct, the data types of fields must be orderable.

Example of an SQL function usage fragment

> SELECT 2 = 2

Result value

true

Example of an SQL function usage fragment

> SELECT 1 = '1'

Result value

true

Example of an SQL function usage fragment

> SELECT true = NULL

Result value

NULL

Example of an SQL function usage fragment

> SELECT NULL = NULL

Result value

NULL

==

expr1 == expr2

Returns true if expr1 equals expr2, or false otherwise.

Arguments

expr1, expr2 - The two expressions must be same type or can be cast to a common type, and must be a type that can be used in equality comparison. Map type is not supported. For complex types such array and struct, the data types of fields must be orderable.

Example of an SQL function usage fragment

> SELECT 2 == 2

Result value

true

Example of an SQL function usage fragment

> SELECT 1 == '1'

Result value

true

Example of an SQL function usage fragment

> SELECT true == NULL

Result value

NULL

Example of an SQL function usage fragment

> SELECT NULL == NULL

Result value

NULL

>

expr1 > expr2

Returns true if expr1 is greater than expr2.

Arguments

expr1, expr2 - the two expressions must be same type or can be cast to a common type, and must be a type that can be ordered. For example, map type is not orderable, so it is not supported. For complex types such array and struct, the data types of fields must be orderable.

Example of an SQL function usage fragment

> SELECT 2 > 1

Result value

true

Example of an SQL function usage fragment

> SELECT 2 > '1.1'

Result value

true

Example of an SQL function usage fragment

> SELECT to_date('2009-07-30 04:17:52') > to_date('2009-07-30 04:17:52')

Result value

false

Example of an SQL function usage fragment

> SELECT to_date('2009-07-30 04:17:52') > to_date('2009-08-01 04:17:52')

Result value

false

Example of an SQL function usage fragment

> SELECT 1 > NULL

Result value

NULL

>=

expr1 >= expr2

Returns true if expr1 is greater than or equal to expr2.

Arguments

expr1, expr2the two expressions must be same type or can be cast to a common type, and must be a type that can be ordered. For example, map type is not orderable, so it is not supported. For complex types such array and struct, the data types of fields must be orderable.

Example of an SQL function usage fragment

> SELECT 2 >= 1

Result value

true

Example of an SQL function usage fragment

> SELECT 2.0 >= '2.1'

Result value

false

Example of an SQL function usage fragment

> SELECT to_date('2009-07-30 04:17:52') >= to_date('2009-07-30 04:17:52')

Result value

true

Example of an SQL function usage fragment

> SELECT to_date('2009-07-30 04:17:52') >= to_date('2009-08-01 04:17:52')

Result value

false

Example of an SQL function usage fragment

> SELECT 1 >= NULL

Result value

NULL

^

expr1 ^ expr2

Returns the result of bitwise exclusive OR of expr1 and expr2.

Example of an SQL function usage fragment

> SELECT 3 ^ 5

Result value

2

abs

abs(expr)

Returns the absolute value of the numeric value.

Example of an SQL function usage fragment

> SELECT abs(-1)

Result value

1

acos

acos(expr)

Returns the inverse cosine (also known as arccosine) of expr if -1<;=expr<=1 or NaN otherwise.

Example of an SQL function usage fragment

> SELECT acos(1)

Result value

0.0

Example of an SQL function usage fragment

> SELECT acos(2)

Result value

NaN

add_months

add_months(start_date, num_months)

Returns the date that is num_months after start_date.

Example of an SQL function usage fragment

> SELECT add_months('2016-08-31', 1)

Result value

2016-09-30

and

expr1 and expr2
Logical AND.

approx_count_distinct

approx_count_distinct(expr[, relativeSD])
Returns the estimated cardinality by HyperLogLog++. relativeSD defines the maximum estimation error allowed.

approx_percentile

approx_percentile(col, percentage [, accuracy])

Returns the approximate percentile value of numeric column col at the indicated percentage. The value of percentage must be between 0.0 and 1.0. The accuracy parameter (default: 10000) is a positive numeric literal that controls approximation accuracy at the cost of memory. Higher value of accuracy yields better accuracy, 1.0/accuracy is the relative error of the approximation. When percentage is an array, each value of the percentage array must be between 0.0 and 1.0. In this case, returns the approximate percentile array of column col at the indicated percentage array.

Example of an SQL function usage fragment

> SELECT approx_percentile(10.0, array(0.5, 0.4, 0.1), 100)

Result value

[10.0,10.0,10.0]

Example of an SQL function usage fragment

> SELECT approx_percentile(10.0, 0.5, 100)

Result value

10.0

array

array(expr, ...)

Returns an array with the indicated elements.

Example of an SQL function usage fragment

> SELECT array(1, 2, 3)

Result value

[1,2,3]

array_contains

array_contains(array, value)

Returns true if the array contains the value.

Example of an SQL function usage fragment

> SELECT array_contains(array(1, 2, 3), 2)

Result value

true

array_distinct

array_distinct(array)

Removes duplicate values from the array.

Example of an SQL function usage fragment

> SELECT array_distinct(array(1, 2, 3, null, 3))

Result value

[1,2,3,null]

array_except

array_except(array1, array2)

Returns an array of the elements in array1 but not in array2, without duplicates.

Example of an SQL function usage fragment

> SELECT array_except(array(1, 2, 3), array(1, 3, 5))

Result value

[2]

array_intersect

array_intersect(array1, array2)

Returns an array of the elements in the intersection of array1 and array2, without duplicates.

Example of an SQL function usage fragment

> SELECT array_intersect(array(1, 2, 3), array(1, 3, 5))

Result value

[1,3]

array_overlap

arrays_overlap(a1, a2)

Returns true if a1 contains at least a non-null element present also in a2. If the arrays have no common element and they are both non-empty and either of them contains a null element null is returned, false otherwise.

Example of an SQL function usage fragment

> SELECT arrays_overlap(array(1, 2, 3), array(3, 4, 5))

Result value

true

array_remove

array_remove(array, element)

Remove all elements that equal to element from array.

Example of an SQL function usage fragment

> SELECT array_remove(array(1, 2, 3, null, 3), 3)

Result value

[1,2,null]

array_sort

array_sort(expr, func)

Sorts the input array. If func is omitted, sort in ascending order. The elements of the input array must be orderable. NaN is greater than any non-NaN elements for double/float type. Null elements will be placed at the end of the returned array. Since 3.0.0 this function also sorts and returns the array based on the given comparator function. The comparator will take two arguments representing two elements of the array. It returns a negative integer, 0, or a positive integer as the first element is less than, equal to, or greater than the second element. If the comparator function returns null, the function will fail and raise an error.

Example of an SQL function usage fragment

> SELECT array_sort(array(5, 6, 1), (left, right) -> case when left < right then -1 when left > right then 1 else 0 end);

Result value

[1,5,6]

Example of an SQL function usage fragment

> SELECT array_sort(array('bc', 'ab', 'dc'), (left, right) -> case when left is null and right is null then 0 when left is null then -1 when right is null then 1 when left <    right then 1 when left > right then -1 else 0 end);

Result value

["dc","bc","ab"]

Example of an SQL function usage fragment

> SELECT array_sort(array('b', 'd', null, 'c', 'a'))

Result value

["a","b","c","d",null]

array_union

array_union(array1, array2)

Returns an array of the elements in the union of array1 and array2, without duplicates.

Example of an SQL function usage fragment

> SELECT array_union(array(1, 2, 3), array(1, 3, 5))

Result value

[1,2,3,5]

ascii

ascii(str)

Returns the numeric value of the first character of str.

Example of an SQL function usage fragment

> SELECT ascii('222')

Result value

50

Example of an SQL function usage fragment

> SELECT ascii(2)

Result value

50

asin

asin(expr)

Returns the inverse sine (also known as arcsine) the arc sin of expr if -1<=expr<=1 or NaN otherwise.

Example of an SQL function usage fragment

> SELECT asin(0)

Result value

0.0

Example of an SQL function usage fragment

SELECT asin(2)

Result value

NaN

assert_true

assert_true(expr1 [, expr2])

Returns null if column expr1 is true; throws an exception with the provided error message expr2 otherwise. Useful to check if a cast for a column is successful for every row.

Example of an SQL function usage fragment

> SELECT ifnull(assert_true(cast(col as int) is not null), cast(col as int)) as col_int from mytable

Alternatively, using it in the WHERE clause

SELECT cast(col as int) as col_int
FROM mytable
WHERE assert_true(cast(col as int) is not null) is null

Result value when expr1 is false

SQL execution failed
An assert_true function failed: ''NOT (cast(col#14393 as int) = cast(null as int))' is not true!'. Check your data to see which rows don't pass the function's condition and fix the issue.

Example with a custom error message

> SELECT ifnull(assert_true(cast(col as int) is not null, "col must be a valid integer and not null!"), cast(col as int)) as col_int from mytable

Result value when expr1 is false

SQL execution failed
An assert_true function failed: 'col must be a valid integer and not null!'. Check your data to see which rows don't pass the function's condition and fix the issue.

atan

atan(expr)

Returns the inverse tangent (also known as arctangent).

Example of an SQL function usage fragment

> SELECT atan(0)

Result value

 0.0

atan2

atan2(expr1, expr2)

Returns the angle in radians between the positive x-axis of a plane and the point that is indicated by the coordinates (expr1, expr2).

Example of an SQL function usage fragment

> SELECT atan2(0, 0)

Result value

 0.0

avg

avg(expr)
Returns the mean that is calculated from values of a group.

base64

base64(bin)

Converts the argument from a binary bin to a base 64 string.

Example of an SQL function usage fragment

> SELECT base64('Spark SQL')

Result value

 U3BhcmsgU1FM

bigint

bigint(expr)
Casts the value expr to the target data type bigint.

bin

bin(expr)

Returns the string representation of the long value expr represented in binary.

Example of an SQL function usage fragment

> SELECT bin(13)

Result value

 1101

Example of an SQL function usage fragment

> SELECT bin(-13)

Result value

 1111111111111111111111111111111111111111111111111111111111110011

Example of an SQL function usage fragment

> SELECT bin(13.3)

Result value

 1101

binary

binary(expr)
Casts the value expr to the target data type binary.

bit_length

bit_length(expr)

Returns the bit length of string data or number of bits of binary data.

Example of an SQL function usage fragment

> SELECT bit_length('Spark SQL')

Result value

 72

boolean

boolean(expr)
Casts the value expr to the target data type boolean.

bround

bround(expr, d): Returns expr rounded to d decimal places by using HALF_EVEN rounding mode.

Example of an SQL function usage fragment

> SELECT bround(2.5, 0)

Result value

2.0

cast

cast(expr AS type)

Casts the value expr to the target data type type.

Example of an SQL function usage fragment

> SELECT cast('10' as int)

Result value

 10

cbrt

cbrt(expr)

Returns the cube root of expr.

Example of an SQL function usage fragment

> SELECT cbrt(27.0)

Result value

 3.0

ceil

ceil(expr)

Returns the smallest integer not smaller than expr.

Example of an SQL function usage fragment

> SELECT ceil(-0.1)

Result value

 0

Example of an SQL function usage fragment

> SELECT ceil(5)

Result value

 5

ceiling

ceiling(expr)

Returns the smallest integer not smaller than expr.

Example of an SQL function usage fragment

> SELECT ceiling(-0.1)

Result value

 0

Example of an SQL function usage fragment

> SELECT ceiling(5)

Result value

 5

char

char(expr)

Returns the ASCII character that has the binary equivalent to expr. If n is larger than 256, the result is equivalent to chr(n % 256).

Example of an SQL function usage fragment

> SELECT char(65)

Result value

 A

char_length

char_length(expr)

Returns the character length of string data or number of bytes of binary data. The length of string data includes the trailing spaces. The length of binary data includes binary zeros.

Example of an SQL function usage fragment

> SELECT char_length('Spark SQL ')

Result value

 10

Example of an SQL function usage fragment

> SELECT CHAR_LENGTH('Spark SQL ')

Result value

 10

Example of an SQL function usage fragment

> SELECT CHARACTER_LENGTH('Spark SQL ')

Result value

 10

character_length

character_length(expr)

Returns the character length of string data or number of bytes of binary data. The length of string data includes the trailing spaces. The length of binary data includes binary zeros.

Example of an SQL function usage fragment

> SELECT character_length('Spark SQL ')

Result value

 10

Example of an SQL function usage fragment

> SELECT CHAR_LENGTH('Spark SQL ')

Result value

 10

Example of an SQL function usage fragment

> SELECT CHARACTER_LENGTH('Spark SQL ')

Result value

 10

chr

chr(expr)

Returns the ASCII character that has the binary equivalent to expr. If n is larger than 256, the result is equivalent to chr(n % 256).

Example of an SQL function usage fragment

> SELECT chr(65)

Result value

 A

coalesce

coalesce(expr1, expr2, ...)

Returns the first nonnull argument if exists. Otherwise, null.

Example of an SQL function usage fragment

> SELECT coalesce(NULL, 1, NULL)

Result value

 1

collect_list

collect_list(expr)
Collects and returns a list of nonunique elements.

collect_set

collect_set(expr)
Collects and returns a set of unique elements.

concat

concat(str1, str2, ..., strN)

Returns the concatenation of str1, str2, ..., strN.

Example of an SQL function usage fragment

> SELECT concat('Spark', 'SQL')

Result value

 SparkSQL

concat_ws

concat_ws(sep, [str | array(str)]+)

Returns the concatenation of the strings that are separated by sep.

Example of an SQL function usage fragment

> SELECT concat_ws(' ', 'Spark', 'SQL')

Result value

 Spark SQL

conv

conv(num, from_base, to_base)

Convert num from from_base to to_base.

Example of an SQL function usage fragment

> SELECT conv('100', 2, 10)

Result value

 4

Example of an SQL function usage fragment

> SELECT conv(-10, 16, -10)

Result value

 -16

corr

corr(expr1, expr2)
Returns Pearson coefficient of correlation between a set of number pairs.

cos

cos(expr)

Returns the cosine of expr.

Example of an SQL function usage fragment

> SELECT cos(0)

Result value

 1.0

cosh

cosh(expr)

Returns the hyperbolic cosine of expr.

Example of an SQL function usage fragment

> SELECT cosh(0)

Result value

 1.0

cot

cot(expr)

Returns the cotangent of expr.

Example of an SQL function usage fragment

> SELECT cot(1)

Result value

 0.6420926159343306

count

count(*)
Returns the total number of retrieved rows, including rows that contain null.
count(expr)
Returns the number of rows for which the supplied expression is nonnull.
count(DISTINCT expr[, expr...])
Returns the number of rows for which the supplied expressions are unique and nonnull.

count_min_sketch

count_min_sketch(col, eps, confidence, seed)
Returns a count-min sketch of a column with the indicated esp, confidence, and seed. The result is an array of bytes, which can be de-serialized to a CountMinSketch before usage. Count-min sketch is a probabilistic data structure that is used for cardinality estimation by using sublinear space.

covar_pop

covar_pop(expr1, expr2)
Returns the population covariance of a set of number pairs.

covar_samp

covar_samp(expr1, expr2)
Returns the sample covariance of a set of number pairs.

crc32

crc32(expr)

Returns a cyclic redundancy check value of the expr as a bigint.

Example of an SQL function usage fragment

> SELECT crc32('Spark')

Result value

 1557323817

cube

cume_dist

cume_dist()
Computes the position of a value relative to all values in the partition.

current_database

current_database()

Returns the current database.

Example of an SQL function usage fragment

> SELECT current_database()

Result value

 default

current_date

current_date()
Returns the current date at the start of query evaluation.

current_timestamp

current_timestamp()
Returns the current timestamp at the start of query evaluation.

date

date(expr)
Casts the value expr to the target data type date.

date_add

date_add(start_date, num_days)

Returns the date that is num_days after start_date.

Example of an SQL function usage fragment

> SELECT date_add('2016-07-30', 1)

Result value

 2016-07-31

date_format

date_format(timestamp, fmt)

Converts timestamp to a value of string in the format specified by the date format fmt.

Example of an SQL function usage fragment

> SELECT date_format('2016-04-08', 'y')

Result value

 2016

Example of an SQL function usage fragment

> SELECT date_format('2016-04-16T15:16:29.000Z', 'y-MM-dd')

Result value

2016-04-16

date_sub

date_sub(start_date, num_days)

Returns the date that is num_days before start_date.

Example of an SQL function usage fragment

> SELECT date_sub('2016-07-30', 1)

Result value

 2016-07-29

date_trunc

date_trunc(fmt, ts)

Returns timestamp ts truncated to the unit specified by the format model fmt. fmt to be one of ["YEAR", "YYYY", "YY", "MON", "MONTH", "MM", "DAY", "DD", "HOUR", "MINUTE", "SECOND", "WEEK", "QUARTER"].

Example of an SQL function usage fragment

> SELECT date_trunc('YEAR', '2015-03-05T09:32:05.359')

Result value

 2015-01-01T00:00:00

Example of an SQL function usage fragment

> SELECT date_trunc('MM', '2015-03-05T09:32:05.359')

Result value

 2015-03-01T00:00:00

Example of an SQL function usage fragment

> SELECT date_trunc('DD', '2015-03-05T09:32:05.359')

Result value

 2015-03-05T00:00:00

Example of an SQL function usage fragment

> SELECT date_trunc('HOUR', '2015-03-05T09:32:05.359')

Result value

 2015-03-05T09:00:00

datediff

datediff(endDate, startDate)

Returns the number of days from startDate to endDate.

Example of an SQL function usage fragment

> SELECT datediff('2009-07-31', '2009-07-30')

Result value

 1

Example of an SQL function usage fragment

> SELECT datediff('2009-07-30', '2009-07-31')

Result value

 -1

day

day(date)

Returns the day of month of the date/timestamp.

Example of an SQL function usage fragment

> SELECT day('2009-07-30')

Result value

30

dayofmonth

dayofmonth(date)

Returns the day of month of the date/timestamp.

Example of an SQL function usage fragment

> SELECT dayofmonth('2009-07-30')

Result value

 30

dayofweek

dayofweek(date)

Returns the day of the week for date/timestamp (1 = Sunday, 2 = Monday, ..., 7 = Saturday).

Example of an SQL function usage fragment

> SELECT dayofweek('2009-07-30')

Result value

 5

dayofyear

dayofyear(date)

Returns the day of year of the date/timestamp.

Example of an SQL function usage fragment

> SELECT dayofyear('2016-04-09')

Result value

100

decimal

decimal(expr)
Casts the value expr to the target data type DECIMAL(10,0), truncating the result if needed. Use CAST(expr AS DECIMAL(p,f)) to specify precision and fractional digits explicitly.

decode

decode(bin, charset)

Decodes the first argument by using the second argument character set.

Example of an SQL function usage fragment

> SELECT decode(encode('abc', 'utf-8'), 'utf-8')

Result value

 abc

degrees

degrees(expr)

Converts radians to degrees.

Example of an SQL function usage fragment

> SELECT degrees(3.141592653589793)

Result value

 180.0

dense_rank

dense_rank()
Computes the rank of a value in a group of values. The result is one plus the previously assigned rank value. Unlike the function rank, dense_rank does not produce gaps in the ranking sequence.

double

double(expr)
Casts the value expr to the target data type double.

e

e()

Returns Euler's number, e.

Example of an SQL function usage fragment

> SELECT e()

Result value

2.718281828459045

elt

elt(n, input1, input2, ...)

Returns the n-th input, for example, returns input2 when n is 2.

Example of an SQL function usage fragment

> SELECT elt(1, 'scala', 'java')

Result value

 scala

encode

encode(str, charset)

Encodes the first argument by using the second argument character set.

Example of an SQL function usage fragment

> SELECT encode('abc', 'utf-8')

Result value

 abc

exp

exp(expr)

Returns e to the power of expr.

Example of an SQL function usage fragment

> SELECT exp(0)

Result value

 1.0

explode

explode(expr)

Separates the elements of array expr into multiple rows, or the elements of map expr into multiple rows and columns.

Example of an SQL function usage fragment

> SELECT explode(array(10, 20))

Result value

 10  20

explode_outer

explode_outer(expr)

Separates the elements of array expr into multiple rows, or the elements of map expr into multiple rows and columns.

Example of an SQL function usage fragment

> SELECT explode_outer(array(10, 20))

Result value

 10  20

expm1

expm1(expr)

Returns exp(expr) - 1.

Example of an SQL function usage fragment

> SELECT expm1(0)

Result value

 0.0

factorial

factorial(expr)

Returns the factorial of expr. expr is [0..20]. Otherwise, null.

Example of an SQL function usage fragment

> SELECT factorial(5)

Result value

 120

find_in_set

find_in_set(str, str_array)

Returns the index (1-based) of the indicated string (str) in the comma-delimited list (str_array). Returns 0, if the string was not found or if the indicated string (str) contains a comma.

Example of an SQL function usage fragment

> SELECT find_in_set('ab','abc,b,ab,c,def')

Result value

 3

first

first(expr[, isIgnoreNull])
Returns the first value of expr for a group of rows. If isIgnoreNull is true, returns only nonnull values.

first_value

first_value(expr[, isIgnoreNull])
Returns the first value of expr for a group of rows. If isIgnoreNull is true, returns only nonnull values.

float

float(expr)
Casts the value expr to the target data type float.

floor

floor(expr)

Returns the largest integer not greater than expr.

Example of an SQL function usage fragment

> SELECT floor(-0.1)

Result value

 -1

Example of an SQL function usage fragment

> SELECT floor(5)

Result value

 5

format_number

format_number(expr1, expr2)

Formats the number expr1 like '#,###,###.##', rounded to expr2 decimal places. If expr2 is 0, the result has no decimal point or fractional part. This function is supposed to work like MySQL's FORMAT.

Example of an SQL function usage fragment

> SELECT format_number(12332.123456, 4)

Result value

 12,332.1235

format_string

format_string(strfmt, obj, ...)

Returns a formatted string from printf-style format strings.

Example of an SQL function usage fragment

> SELECT format_string("Hello World %d %s", 100, "days")

Result value

 Hello World 100 days

from_json

from_json(jsonStr, schema[, options])

Returns a struct value with the indicated jsonStr and schema.

Example of an SQL function usage fragment

> SELECT from_json('{"a":1, "b":0.8}', 'a INT, b DOUBLE')

Result value

 {"a":1, "b":0.8}

Example of an SQL function usage fragment

> SELECT from_json('{"time":"26/08/2015"}', 'time Timestamp', map('timestampFormat', 'dd/MM/yyyy'))

Result value

 {"time":"2015-08-26 00:00:00.0"}

from_unixtime

from_unixtime(unix_time, format)

Returns unix_time in the specified format.

Example of an SQL function usage fragment

> SELECT from_unixtime(0, 'yyyy-MM-dd HH:mm:ss')

Result value

 1970-01-01 00:00:00

from_utc_timestamp

from_utc_timestamp(timestamp, timezone)

Given a timestamp, such as '2017-07-14 02:40:00.0', this function interprets it as a time in Coordinated Universal Time (UTC), and renders that time as a timestamp in the indicated time zone. For example, 'GMT+1' would yield '2017-07-14 03:40:00.0'.

Example of an SQL function usage fragment

> SELECT from_utc_timestamp('2016-08-31', 'Asia/Seoul')

Result value

 2016-08-31 09:00:00

get_json_object

get_json_object(json_txt, path)

Extracts a JSON object from path.

Example of an SQL function usage fragment

> SELECT get_json_object('{"a":"b"}', '$.a')

Result value

 b

greatest

greatest(expr, ...)

Returns the greatest value of all parameters, skipping null values.

Example of an SQL function usage fragment

> SELECT greatest(10, 9, 2, 4, 3)

Result value

 10

grouping

grouping_id

hash

hash(expr1, expr2, ...)

Returns a hash value of the arguments.

Example of an SQL function usage fragment

> SELECT hash('Spark', array(123), 2)

Result value

 -1321691492

hex

hex(expr)

Converts expr to hexadecimal.

Example of an SQL function usage fragment

> SELECT hex(17)

Result value

 11

Example of an SQL function usage fragment

> SELECT hex('Spark SQL')

Result value

 537061726B2053514C

hour

hour(timestamp)

Returns the hour component of the string/timestamp.

Example of an SQL function usage fragment

> SELECT hour('2009-07-30 12:58:59')

Result value

 12

hypot

hypot(expr1, expr2)

Returns sqrt(expr1**2 + expr2**2).

Example of an SQL function usage fragment

> SELECT hypot(3, 4)

Result value

 5.0

if

if(expr1, expr2, expr3)

If expr1 evaluates to true, then returns expr2; otherwise, returns expr3.

Example of an SQL function usage fragment

> SELECT if(1 < 2, 'a', 'b')

Result value

 a

ifnull

ifnull(expr1, expr2)

Returns expr2 if expr1 is null, or expr1 otherwise.

Example of an SQL function usage fragment

> SELECT ifnull(NULL, array('2'))

Result value

 ["2"]

in

expr1 in(expr2, expr3, ...)

Returns true if expr equals to any valN.

Arguments

expr1, expr2, expr3, ... - the arguments must be same type.

Example of an SQL function usage fragment

> SELECT 1 in(1, 2, 3)

Result value

 true

Example of an SQL function usage fragment

> SELECT 1 in(2, 3, 4)

Result value

 false

Example of an SQL function usage fragment

> SELECT named_struct('a', 1, 'b', 2) in(named_struct('a', 1, 'b', 1), named_struct('a', 1, 'b', 3))

Result value

 false

Example of an SQL function usage fragment

> SELECT named_struct('a', 1, 'b', 2) in(named_struct('a', 1, 'b', 2), named_struct('a', 1, 'b', 3))

Result value

 true

initcap

initcap(str)

Returns str with the first letter of each word in uppercase. All other letters are in lowercase. Words are delimited by white space.

Example of an SQL function usage fragment

> SELECT initcap('sPark sql')

Result value

 Spark Sql

inline

inline(expr)

Explodes an array of structs into a table.

Example of an SQL function usage fragment

> SELECT inline(array(struct(1, 'a'), struct(2, 'b')))

Result value

 1  a   2  b

inline_outer

inline_outer(expr)

Explodes an array of structs into a table.

Example of an SQL function usage fragment

> SELECT inline_outer(array(struct(1, 'a'), struct(2, 'b')))

Result value

 1  a   2  b

input_file_name

input_file_name()
Returns the name of the file that is being read, or empty string if not available.

instr

instr(str, substr)

Returns the (1-based) index of the first occurrence of substr in str.

Example of an SQL function usage fragment

> SELECT instr('SparkSQL', 'SQL')

Result value

 6

int

int(expr)
Casts the value expr to the target data type int.

isnan

isnan(expr)

Returns true if expr is NaN, or false otherwise.

Example of an SQL function usage fragment

> SELECT isnan(cast('NaN' as double))

Result value

 true

isnotnull

isnotnull(expr)

Returns true if expr is not null, or false otherwise.

Example of an SQL function usage fragment

> SELECT isnotnull(1)

Result value

 true

isnull

isnull(expr)

Returns true if expr is null, or false otherwise.

Example of an SQL function usage fragment

> SELECT isnull(1)

Result value

 false

json_tuple

json_tuple(jsonStr, p1, p2, ..., pn)

Returns a tuple like the function get_json_object, but it takes multiple names. All the input parameters and output column types are string.

Example of an SQL function usage fragment

> SELECT json_tuple('{"a":1, "b":2}', 'a', 'b')

Result value

 1  2

kurtosis

kurtosis(expr)
Returns the kurtosis value that is calculated from values of a group.

lag

lag(input[, offset[, default]])
Returns the value of input at the offsetth row before the current row in the window. The default value of offset is 1 and the default value of default is null. If the value of input at the offsetth row is null, null is returned. If no such offset row exists (for example, when the offset is 1, the first row of the window does not have any previous row), default is returned.

last

last(expr[, isIgnoreNull])
Returns the last value of expr for a group of rows. If isIgnoreNull is true, returns only nonnull values.

last_day

last_day(date)

Returns the last day of the month that the date belongs to.

Example of an SQL function usage fragment

> SELECT last_day('2009-01-12')

Result value

 2009-01-31

last_value

last_value(expr[, isIgnoreNull])
Returns the last value of expr for a group of rows. If isIgnoreNull is true, returns only nonnull values.

lcase

lcase(str)

Returns str with all characters that are changed to lowercase.

Example of an SQL function usage fragment

> SELECT lcase('SparkSql')

Result value

 sparksql

lead

lead(input[, offset[, default]])
Returns the value of input at the offsetth row after the current row in the window. The default value of offset is 1 and the default value of default is null. If the value of input at the offsetth row is null, null is returned. If no such offset row exists (for example, when the offset is 1, the last row of the window does not have any subsequent row), default is returned.

least

least(expr, ...)

Returns the least value of all parameters, skipping null values.

Example of an SQL function usage fragment

> SELECT least(10, 9, 2, 4, 3)

Result value

 2

left

left(str, len)

Returns the leftmost len(len can be string type) characters from the string str, if len is less or equal than 0 the result is an empty string.

Example of an SQL function usage fragment

> SELECT left('Spark SQL', 3)

Result value

 Spa

length

length(expr)

Returns the character length of string data or number of bytes of binary data. The length of string data includes the trailing spaces. The length of binary data includes binary zeros.

Example of an SQL function usage fragment

> SELECT length('Spark SQL ')

Result value

 10

Example of an SQL function usage fragment

> SELECT CHAR_LENGTH('Spark SQL ')

Result value

 10

Example of an SQL function usage fragment

> SELECT CHARACTER_LENGTH('Spark SQL ')

Result value

 10

levenshtein

levenshtein(str1, str2)

Returns the Levenshtein distance between the two indicated strings.

Example of an SQL function usage fragment

> SELECT levenshtein('kitten', 'sitting')

Result value

 3

like

str like pattern

Returns true if str matches pattern, null if any arguments are null, otherwise, false.

Arguments

  • str - A string expression.
  • pattern - A string expression. The pattern is a string that is matched literally, with exception to the following special symbols:

_ matches any one character in the input (similar to . in posix regular expressions)

% matches zero or more characters in the input (similar to .* in posix regular expressions)

The escape character is \. If an escape character precedes a special symbol or another escape character, the following character is matched literally. It is invalid to escape any other character.

String literals are unescaped in our SQL parser. For example, to match \abc, the pattern is \\abc.

Example of an SQL function usage fragment

> SELECT '%SystemDrive%\Users\John' like '\%SystemDrive\%\\Users%'

Result value

true

Note:

Use RLIKE to match with standard regular expressions.

ln

ln(expr)

Returns the natural logarithm (base e) of expr.

Example of an SQL function usage fragment

> SELECT ln(1)

Result value

 0.0

locate

locate(substr, str[, pos])

Returns the position of the first occurrence of substr in str after position pos. The indicated pos and return value are 1-based.

Example of an SQL function usage fragment

> SELECT locate('bar', 'foobarbar')

Result value

 4

Example of an SQL function usage fragment

> SELECT locate('bar', 'foobarbar', 5)

Result value

 7

Example of an SQL function usage fragment

> SELECT POSITION('bar' IN 'foobarbar')

Result value

 4

log

log(base, expr)

Returns the logarithm of expr with base.

Example of an SQL function usage fragment

> SELECT log(10, 100)

Result value

 2.0

log10

log10(expr)

Returns the logarithm of expr with base 10.

Example of an SQL function usage fragment

> SELECT log10(10)

Result value

 1.0

log1p

log1p(expr)

Returns log(1 + expr).

Example of an SQL function usage fragment u

> SELECT log1p(0)

Result value

 0.0

log2

log2(expr)

Returns the logarithm of expr with base 2.

Example of an SQL function usage fragment

> SELECT log2(2)

Result value

 1.0

lower

lower(str)

Returns str with all characters that are changed to lowercase.

Example of an SQL function usage fragment

> SELECT lower('SparkSql')

Result value

 sparksql

lpad

lpad(str, len, pad)

Returns str, left-padded with pad to a length of len. If str is longer than len, the return value is shortened to len characters.

Example of an SQL function usage fragment

> SELECT lpad('hi', 5, '??')

Result value

 ???hi

Example of an SQL function usage fragment

> SELECT lpad('hi', 1, '??')

Result value

 h

ltrim

ltrim(str)

Removes the leading space characters from str.

ltrim(trimStr, str)

Removes the leading string contains the characters from the trim string.

Arguments

  • str - A string expression.
  • trimStr - The trim string characters to trim, the default value is a single space.

Example of an SQL function usage fragment

> SELECT ltrim('    SparkSQL   ')

Result value

 SparkSQL

Example of an SQL function usage fragment

> SELECT ltrim('Sp', 'SSparkSQLS')

Result value

 arkSQLS

map

map(key0, value0, key1, value1, ...)

Creates a map with the indicated key/value pairs.

Example of an SQL function usage fragment

> SELECT map(1.0, '2', 3.0, '4')

Result value

 {1.0:"2",3.0:"4"}

map_keys

map_keys(map)

Returns an unordered array that contains the keys of the map.

Example of an SQL function usage fragment

> SELECT map_keys(map(1, 'a', 2, 'b'))

Result value

 [1,2]

map_values

map_values(map)

Returns an unordered array that contains the values of the map.

Example of an SQL function usage fragment

> SELECT map_values(map(1, 'a', 2, 'b'))

Result value

 ["a","b"]

max

max(expr)
Returns the maximum value of expr.

md5

md5(expr)

Returns an MD5 128-bit checksum as a hex string of expr.

Example of an SQL function usage fragment

> SELECT md5('Spark')

Result value

 8cde774d6f7333752ed72cacddb05126

mean

mean(expr)
Returns the mean that is calculated from values of a group.

min

min(expr)
Returns the minimum value of expr.

minute

minute(timestamp)

Returns the minute component of the string/timestamp.

Example of an SQL function usage fragment

> SELECT minute('2009-07-30 12:58:59')

Result value

 58

mod

expr1 mod expr2

Returns the remainder after expr1/expr2.

Example of an SQL function usage fragment

> SELECT 2 mod 1.8;

Result value

 0.2

Example of an SQL function usage fragment

> SELECT MOD(2, 1.8)

Result value

 0.2

monotonically_increasing_id

monotonically_increasing_id()
Returns monotonically increasing 64-bit integers. The generated ID is guaranteed to be monotonically increasing and unique, but not consecutive. The current implementation puts the partition ID in the upper 31 bits, and the lower 33 bits represent the record number within each partition. The assumption is that the data frame has less than 1 billion partitions, and each partition has less than 8 billion records.

month

month(date)

Returns the month component of the date/timestamp.

Example of an SQL function usage fragment

> SELECT month('2016-07-30')

Result value

 7

months_between

months_between(timestamp1, timestamp2)

Returns number of months between timestamp1 and timestamp2.

Example of an SQL function usage fragment

> SELECT months_between('1997-02-28 10:30:00', '1996-10-30')

Result value

 3.94959677

named_struct

named_struct(name1, val1, name2, val2, ...)

Creates a struct with the indicated field names and values.

Example of an SQL function usage fragment

> SELECT named_struct("a", 1, "b", 2, "c", 3)

Result value

 {"a":1,"b":2,"c":3}

nanvl

nanvl(expr1, expr2)

Returns expr1 if it's not NaN, or expr2 otherwise.

Example of an SQL function usage fragment

> SELECT nanvl(cast('NaN' as double), 123)

Result value

 123.0

negative

negative(expr)

Returns the negated value of expr.

Example of an SQL function usage fragment

> SELECT negative(1)

Result value

 -1

next_day

next_day(start_date, day_of_week)

Returns the first date that is later than start_date and named as indicated.

Example of an SQL function usage fragment

> SELECT next_day('2015-01-14', 'TU')

Result value

 2015-01-20

not

not expr: Logical not.

now

now()
Returns the current timestamp at the start of query evaluation.

ntile

ntile(n)
Divides the rows for each window partition into n buckets that range from 1 to at most n.

nullif

nullif(expr1, expr2)

Returns null if expr1 equals to expr2, or expr1 otherwise.

Example of an SQL function usage fragment

> SELECT nullif(2, 2)

Result value

 NULL

nvl

nvl(expr1, expr2)

Returns expr2 if expr1 is null, or expr1 otherwise.

Example of an SQL function usage fragment

> SELECT nvl(NULL, array('2'))

Result value

 ["2"]

nvl2

nvl2(expr1, expr2, expr3)

Returns expr2 if expr1 is not null, or expr3 otherwise.

Example of an SQL function usage fragment

> SELECT nvl2(NULL, 2, 1)

Result value

 1

octet_length

octet_length(expr)

Returns the byte length of string data or number of bytes of binary data.

Example of an SQL function usage fragment

> SELECT octet_length('Spark SQL')

Result value

 9

or

expr1 or expr2
Logical OR.

parse_url

parse_url(url, partToExtract[, key])

Extracts a part from a URL.

Example of an SQL function usage fragment

> SELECT parse_url('http://spark.apache.org/path?query=1', 'HOST')

Result value

 spark.apache.org

Example of an SQL function usage fragment

> SELECT parse_url('http://spark.apache.org/path?query=1', 'QUERY')

Result value

 query=1

Example of an SQL function usage fragment

> SELECT parse_url('http://spark.apache.org/path?query=1', 'QUERY', 'query')

Result value

 1

percent_rank

percent_rank()
Computes the percentage ranking of a value in a group of values.

percentile

percentile(col, percentage [, frequency])
Returns the exact percentile value of numeric column col at the indicated percentage. The value of percentage must be between 0.0 and 1.0. The value of frequency must be a positive integral.
percentile(col, array(percentage1 [, percentage2]...) [, frequency])
Returns the exact percentile value array of numeric column col at the indicated percentage. Each value of the percentage array must be between 0.0 and 1.0. The value of frequency must be a positive integral.

percentile_approx

percentile_approx(col, percentage [, accuracy])

Returns the approximate percentile value of numeric column col at the indicated percentage. The value of percentage must be between 0.0 and 1.0. The accuracy parameter (default: 10000) is a positive numeric literal that controls approximation accuracy at the cost of memory. Higher value of accuracy yields better accuracy, 1.0/accuracy is the relative error of the approximation. When percentage is an array, each value of the percentage array must be between 0.0 and 1.0. In this case, returns the approximate percentile array of column col at the indicated percentage array.

Example of an SQL function usage fragment

> SELECT percentile_approx(10.0, array(0.5, 0.4, 0.1), 100)

Result value

 [10.0,10.0,10.0]

Example of an SQL function usage fragment

> SELECT percentile_approx(10.0, 0.5, 100)

Result value

 10.0

pi

pi()

Returns pi.

Example of an SQL function usage fragment

> SELECT pi()

Result value

 3.141592653589793

pmod

pmod(expr1, expr2)

Returns the positive value of expr1 mod expr2.

Example of an SQL function usage fragment

> SELECT pmod(10, 3)

Result value

 1

Example of an SQL function usage fragment

> SELECT pmod(-10, 3)

Result value

 2

posexplode

posexplode(expr)

Separates the elements of array expr into multiple rows with positions, or the elements of map expr into multiple rows and columns with positions.

Example of an SQL function usage fragment

> SELECT posexplode(array(10,20))

Result value

 0  10  1  20

posexplode_outer

posexplode_outer(expr)

Separates the elements of array expr into multiple rows with positions, or the elements of map expr into multiple rows and columns with positions.

Example of an SQL function usage fragment

> SELECT posexplode_outer(array(10,20))
0  10
1  20

position

position(substr, str[, pos])

Returns the position of the first occurrence of substr in str after position pos. The indicated pos and return value are 1-based.

Example of an SQL function usage fragment

> SELECT position('bar', 'foobarbar')

Result value

 4

Example of an SQL function usage fragment

> SELECT position('bar', 'foobarbar', 5)

Result value

 7

Example of an SQL function usage fragment

> SELECT POSITION('bar' IN 'foobarbar')

Result value

 4

positive

positive(expr)
Returns the value of expr.

pow

pow(expr1, expr2)

Raises expr1 to the power of expr2.

Example of an SQL function usage fragment

> SELECT pow(2, 3)

Result value

 8.0

power

power(expr1, expr2)

Raises expr1 to the power of expr2.

Example of an SQL function usage fragment

> SELECT power(2, 3)

Result value

 8.0

printf

printf(strfmt, obj, ...)

Returns a formatted string from printf-style format strings.

Example of an SQL function usage fragment

> SELECT printf("Hello World %d %s", 100, "days")

Result value

 Hello World 100 days

quarter

quarter(date)

Returns the quarter of the year for date, in the range 1 - 4.

Example of an SQL function usage fragment

> SELECT quarter('2016-08-31')

Result value

 3

radians

radians(expr)

Converts degrees to radians.

Example of an SQL function usage fragment

> SELECT radians(180)

Result value

 3.141592653589793

rand

rand([seed])

Returns a random value with independent and identically distributed (i.i.d.) uniformly distributed values in [0, 1).

Example of an SQL function usage fragment

> SELECT rand()

Result value

 0.9629742951434543

Example of an SQL function usage fragment

> SELECT rand(0)

Result value

 0.8446490682263027

Example of an SQL function usage fragment

> SELECT rand(null)

Result value

 0.8446490682263027

randn

randn([seed])

Returns a random value with independent and identically distributed (i.i.d.) values drawn from the standard normal distribution.

Example of an SQL function usage fragment

> SELECT randn()

Result value

 -0.3254147983080288

Example of an SQL function usage fragment

> SELECT randn(0)

Result value

 1.1164209726833079

Example of an SQL function usage fragment

> SELECT randn(null)

Result value

 1.1164209726833079

rank

rank()
Computes the rank of a value in a group of values. The result is one plus the number of rows preceding or equal to the current row in the ordering of the partition. The values produce gaps in the sequence.

regexp_extract

regexp_extract(str, regexp[, idx])

Extracts a group that matches regexp.

Example of an SQL function usage fragment

> SELECT regexp_extract('100-200', '(\\d+)-(\\d+)', 1)

Result value

 100

regexp_replace

regexp_replace(str, regexp, rep)

Replaces all substrings of str that match regexp with rep.

Example of an SQL function usage fragment

> SELECT regexp_replace('100-200', '(\\d+)', 'num')

Result value

 num-num

repeat

repeat(str, n)

Returns the string that repeats the indicated string value n times.

Example of an SQL function usage fragment

> SELECT repeat('123', 2)

Result value

 123123

replace

replace(str, search[, replace])

Replaces all occurrences of search with replace.

Arguments

  • str - A string expression.
  • search - A string expression. If search is not found in str, str is returned unchanged.
  • replace - A string expression. If replace is not specified or is an empty string, nothing replaces the string that is removed from str.

Example of an SQL function usage fragment

> SELECT replace('ABCabc', 'abc', 'DEF')

Result value

 ABCDEF

reverse

reverse(str)

Returns the reversed indicated string.

Example of an SQL function usage fragment

> SELECT reverse('Spark SQL')

Result value

 LQS krapS
right(str, len)

Returns the rightmost len(len can be string type) characters from the string str, if len is less or equal than 0 the result is an empty string.

Example of an SQL function usage fragment

> SELECT right('Spark SQL', 3)

Result value

 SQL

rint

rint(expr)

Returns the double value that is closest in value to the argument and is equal to a mathematical integer.

Example of an SQL function usage fragment

> SELECT rint(12.3456)

Result value

 12.0

rlike

str rlike regexp

Returns true if str matches regexp, or false otherwise.

Arguments

  • str - A string expression.
  • regexp - A string expression. The pattern string is a Javaâ„¢ regular expression.

String literals (including regex patterns) are unescaped in our SQL parser. For example, to match "\abc", a regular expression for regexp can be "^\abc$".

Example of an SQL function usage fragment

> SELECT '%SystemDrive%\Users\John' rlike '%SystemDrive%\\Users.*'

Result value

true

Note:

Use LIKE to match with simple string pattern.

rollup

round

round(expr, d)

Returns expr rounded to d decimal places by using HALF_UP rounding mode.

Example of an SQL function usage fragment

> SELECT round(2.5, 0)

Result value

 3.0

row_number

row_number()
Assigns a unique, sequential number to each row, starting with one, according to the ordering of rows within the window partition.

rpad

rpad(str, len, pad)

Returns str, right-padded with pad to a length of len. If str is longer than len, the return value is shortened to len characters.

Example of an SQL function usage fragment

> SELECT rpad('hi', 5, '??')

Result value

 hi???

Example of an SQL function usage fragment

> SELECT rpad('hi', 1, '??')

Result value

 h

rtrim

rtrim(str)

Removes the trailing space characters from str.

rtrim(trimStr, str)

Removes the trailing string that contains the characters from the trim string from the str.

Arguments

  • str - A string expression.
  • trimStr - The trim string characters to trim, the default value is a single space

Example of an SQL function usage fragment

> SELECT rtrim('    SparkSQL   ')

Result value

 SparkSQL

Example of an SQL function usage fragment

> SELECT rtrim('LQSa', 'SSparkSQLS')

Result value

 SSpark

second

second(timestamp)

Returns the second component of the string/timestamp.

Example of an SQL function usage fragment

> SELECT second('2009-07-30 12:58:59')

Result value

 59

sentences

sentences(str[, lang, country])

Splits str into an array of words.

Example of an SQL function usage fragment

> SELECT sentences('Hi there! Good morning.')

Result value

 [["Hi","there"],["Good","morning"]]

sequence

sequence(start, stop, step)

Generates an array of elements from start to stop (inclusive), incrementing by step. The type of the returned elements is the same as the type of argument expressions.

Supported types are: byte, short, integer, long, date, timestamp.

The start and stop expressions must resolve to the same type. If start and stop expressions resolve to the 'date' or 'timestamp' type then the step expression must resolve to the 'interval' or 'year-month interval' or 'day-time interval' type, otherwise to the same type as the start and stop expressions.

Arguments

  • start - An expression. The start of the range.
  • stop - An expression. The end the range (inclusive).
  • step - An optional expression. The step of the range. By default, step is 1 if start is less than or equal to stop, otherwise -1. For the temporal sequences, it's 1 day and -1 day respectively. If start is greater than stop, then the step must be negative, and vice versa.

Example of an SQL function usage fragment

> SELECT sequence(1, 5);

Result value

[1,2,3,4,5]

sha

sha(expr)

Returns a sha1 hash value as a hex string of the expr.

Example of an SQL function usage fragment

> SELECT sha('Spark')

Result value

 85f5955f4b27a9a4c2aab6ffe5d7189fc298b92c

sha1

sha1(expr)

Returns a sha1 hash value as a hex string of the expr.

Example of an SQL function usage fragment

> SELECT sha1('Spark')

Result value

 85f5955f4b27a9a4c2aab6ffe5d7189fc298b92c

sha2

sha2(expr, bitLength)

Returns a checksum of SHA-2 family as a hex string of expr. SHA-224, SHA-256, SHA-384, and SHA-512 are supported. Bit length of 0 is equivalent to 256.

Example of an SQL function usage fragment

> SELECT sha2('Spark', 256)

Result value

 529bc3b07127ecb7e53a4dcf1991d9152c24537d919178022b2c42657f79a26b

shiftleft

shiftleft(base, expr)

Bitwise left shift.

Example of an SQL function usage fragment

> SELECT shiftleft(2, 1)

Result value

 4

shiftright

shiftright(base, expr)

Bitwise (signed) right shift.

Example of an SQL function usage fragment

> SELECT shiftright(4, 1)

Result value

 2

shiftrightunsigned

shiftrightunsigned(base, expr)

Bitwise unsigned right shift.

Example of an SQL function usage fragment

> SELECT shiftrightunsigned(4, 1)

Result value

 2

sign

sign(expr)

Returns -1.0, 0.0 or 1.0 as expr is negative, 0 or positive.

Example of an SQL function usage fragment

> SELECT sign(40)

Result value

 1.0

signum

signum(expr)

Returns -1.0, 0.0 or 1.0 as expr is negative, 0 or positive.

Example of an SQL function usage fragment

> SELECT signum(40)

Result value

 1.0

sin

sin(expr)

Returns the sine of expr.

Example of an SQL function usage fragment

> SELECT sin(0)

Result value

 0.0

sinh

sinh(expr)

Returns the hyperbolic sine of expr.

Example of an SQL function usage fragment

> SELECT sinh(0)

Result value

 0.0

size

size(expr)

Returns the size of an array or a map. Returns -1 if null.

Example of an SQL function usage fragment

> SELECT size(array('b', 'd', 'c', 'a'))

Result value

 4

skewness

skewness(expr)
Returns the skewness value that is calculated from values of a group.

smallint

smallint(expr)
Casts the value expr to the target data type smallint.

sort_array

sort_array(array[, ascendingOrder])

Sorts the input array in ascending or descending order according to the natural ordering of the array elements.

Example of an SQL function usage fragment

> SELECT sort_array(array('b', 'd', 'c', 'a'), true)

Result value

 ["a","b","c","d"]

soundex

soundex(str)

Returns Soundex code of the string.

Example of an SQL function usage fragment

> SELECT soundex('Miller')

Result value

 M460

space

space(n)

Returns a string that consists of n spaces.

Example of an SQL function usage fragment

> SELECT concat(space(2), '1')

Result value

 1

spark_partition_id

spark_partition_id()
Returns the current partition ID.

split

split(str, regex)

Splits str around occurrences that match regex.

Example of an SQL function usage fragment

> SELECT split('oneAtwoBthreeC', '[ABC]')

Result value

 ["one","two","three",""]

sqrt

sqrt(expr)

Returns the square root of expr.

Example of an SQL function usage fragment

> SELECT sqrt(4)

Result value

 2.0

stack

stack(n, expr1, ..., exprk)

Separates expr1, ..., exprk into n rows.

Example of an SQL function usage fragment

> SELECT stack(2, 1, 2, 3)

Result value

1  2   3  NULL

std

std(expr)
Returns the sample standard deviation that is calculated from values of a group.

stddev

stddev(expr)
Returns the sample standard deviation that is calculated from values of a group.

stddev_pop

stddev_pop(expr)
Returns the population standard deviation that is calculated from values of a group.

stddev_samp

stddev_samp(expr)
Returns the sample standard deviation that is calculated from values of a group.

str_to_map

str_to_map(text[, pairDelim[, keyValueDelim]])

Creates a map after splitting the text into key/value pairs by using delimiters. Default delimiters are ',' for pairDelim and ':' for keyValueDelim.

Example of an SQL function usage fragment

> SELECT str_to_map('a:1,b:2,c:3', ',', ':')

Result value

map("a":"1","b":"2","c":"3")

Example of an SQL function usage fragment

> SELECT str_to_map('a')

Result value

 map("a":null)

string

string(expr)
Casts the value expr to the target data type string.

struct

struct(col1, col2, col3, ...)
Creates a struct with the indicated field values.

substr

substr(str, pos[, len])

Returns the substring of str that starts at pos and is of length len, or the slice of byte array that starts at pos and is of length len.

Example of an SQL function usage fragment

> SELECT substr('Spark SQL', 5)

Result value

 k SQL

Example of an SQL function usage fragment

> SELECT substr('Spark SQL', -3)

Result value

 SQL

Example of an SQL function usage fragment

> SELECT substr('Spark SQL', 5, 1)

Result value

 k

substring

substring(str, pos[, len])

Returns the substring of str that starts at pos and is of length len, or the slice of byte array that starts at pos and is of length len.

Example of an SQL function usage fragment

> SELECT substring('Spark SQL', 5)

Result value

 k SQL

Example of an SQL function usage fragment

> SELECT substring('Spark SQL', -3)

Result value

 SQL

Example of an SQL function usage fragment

> SELECT substring('Spark SQL', 5, 1)

Result value

 k

substring_index

substring_index(str, delim, count)

Returns the substring from str before count occurrences of the delimiter delim. If count is positive, everything to the left of the final delimiter (counting from the left) is returned. If count is negative, everything to the right of the final delimiter (counting from the right) is returned. The function substring_index performs a case-sensitive match when you search for delim.

Example of an SQL function usage fragment

> SELECT substring_index('www.apache.org', '.', 2)

Result value

 www.apache

sum

sum(expr)
Returns the sum that is calculated from values of a group.

tan

tan(expr)

Returns the tangent of expr.

Example of an SQL function usage fragment

> SELECT tan(0)

Result value

 0.0

tanh

tanh(expr)

Returns the hyperbolic tangent of expr.

Example of an SQL function usage fragment

> SELECT tanh(0)

Result value

 0.0

timestamp

timestamp(expr)
Casts the value expr to the target data type timestamp.

tinyint

tinyint(expr)
Casts the value expr to the target data type tinyint.

to_date

to_date(date_str[, fmt])

Parses the date_str expression with the fmt expression to a date. Returns null with invalid input. By default, it follows casting rules to a date if the fmt is omitted.

Example of an SQL function usage fragment

> SELECT to_date('2009-07-30 04:17:52')

Result value

 2009-07-30

Example of an SQL function usage fragment

> SELECT to_date('2016-12-31', 'yyyy-MM-dd')

Result value

 2016-12-31

to_json

to_json(expr[, options])

Returns a JSON string with a indicated struct value.

Example of an SQL function usage fragment

> SELECT to_json(named_struct('a', 1, 'b', 2))

Result value

 {"a":1,"b":2}

Example of an SQL function usage fragment

> SELECT to_json(named_struct('time', to_timestamp('2015-08-26', 'yyyy-MM-dd')), map('timestampFormat', 'dd/MM/yyyy'))

Result value

 {"time":"26/08/2015"}

Example of an SQL function usage fragment

> SELECT to_json(array(named_struct('a', 1, 'b', 2))

Result value

 [{"a":1,"b":2}]

Example of an SQL function usage fragment

> SELECT to_json(map('a', named_struct('b', 1)))

Result value

 {"a":{"b":1}}

Example of an SQL function usage fragment

> SELECT to_json(map(named_struct('a', 1),named_struct('b', 2)))

Result value

 {"[1]":{"b":2}}

Example of an SQL function usage fragment

> SELECT to_json(map('a', 1))

Result value

 {"a":1}

Example of an SQL function usage fragment

> SELECT to_json(array((map('a', 1))))

Result value

 [{"a":1}]

to_timestamp

to_timestamp(timestamp[, fmt])

Parses the timestamp expression with the fmt expression to a timestamp. Returns null with invalid input. By default, it follows casting rules to a timestamp if the fmt is omitted.

Example of an SQL function usage fragment

> SELECT to_timestamp('2016-12-31 00:12:00')

Result value

2016-12-31 00:12:00

Example of an SQL function usage fragment

> SELECT to_timestamp('2016-12-31', 'yyyy-MM-dd')

Result value

 2016-12-31 00:00:00

to_unix_timestamp

to_unix_timestamp(expr[, pattern])

Returns the UNIX timestamp of the indicated time.

Example of an SQL function usage fragment

> SELECT to_unix_timestamp('2016-04-08', 'yyyy-MM-dd')

Result value

1460041200

to_utc_timestamp

to_utc_timestamp(timestamp, timezone)

Given a timestamp, such as '2017-07-14 02:40:00.0', this function interprets it as a time in the indicated time zone, and renders that time as a timestamp in UTC. For example, 'GMT+1' would yield '2017-07-14 01:40:00.0'.

Example of an SQL function usage fragment

> SELECT to_utc_timestamp('2016-08-31', 'Asia/Seoul')

Result value

2016-08-30 15:00:00

translate

translate(input, from, to)

Converts the input string by replacing the characters present in the from string with the corresponding characters in the to string.

Example of an SQL function usage fragment

> SELECT translate('AaBbCc', 'abc', '123')

Result value

 A1B2C3

trim

trim(str)

Removes the leading and trailing space characters from str.

trim(BOTH trimStr FROM str)

Removes the leading and trailing trimStr characters from str.

trim(LEADING trimStr FROM str)

Removes the leading trimStr characters from str.

trim(TRAILING trimStr FROM str)

Removes the trailing trimStr characters from str.

Arguments

  • str - A string expression.
  • trimStr - The trim string characters to trim, the default value is a single space.
  • BOTH, FROM - Keywords to specify trimming string characters from both ends of the string.
  • LEADING, FROM - Keywords to specify trimming string characters from the left end of the string.
  • TRAILING, FROM - Keywords to specify trimming string characters from the right end of the string.

Example of an SQL function usage fragment

> SELECT trim('    SparkSQL   ')

Result value

 SparkSQL

Example of an SQL function usage fragment

> SELECT trim('SL', 'SSparkSQLS')

Result value

 parkSQ

Example of an SQL function usage fragment

> SELECT trim(BOTH 'SL' FROM 'SSparkSQLS')

Result value

 parkSQ

Example of an SQL function usage fragment

> SELECT trim(LEADING 'SL' FROM 'SSparkSQLS')

Result value

 parkSQLS

Example of an SQL function usage fragment

> SELECT trim(TRAILING 'SL' FROM 'SSparkSQLS')

Result value

 SSparkSQ

trunc

trunc(date, fmt)

Returns date with the time portion of the day truncated to the unit specified by the format model fmt. fmt is one of ["year", "yyyy", "yy", "mon", "month", "mm"].

Example of an SQL function usage fragment

> SELECT trunc('2009-02-12', 'MM')

Result value

 2009-02-01

Example of an SQL function usage fragment

> SELECT trunc('2015-10-27', 'YEAR')

Result value

 2015-01-01

ucase

ucase(str)

Returns str with all characters that are changed to uppercase.

Example of an SQL function usage fragment

> SELECT ucase('SparkSql')

Result value

 SPARKSQL

unbase64

unbase64(str)

Converts the argument from a base 64 string str to a binary.

Example of an SQL function usage fragment

> SELECT unbase64('U3BhcmsgU1FM')

Result value

 Spark SQL

unhex

unhex(expr)

Converts hexadecimal expr to binary.

Example of an SQL function usage fragment

> SELECT decode(unhex('537061726B2053514C'), 'UTF-8')

Result value

 Spark SQL

unix_timestamp

unix_timestamp([expr[, pattern]])

Returns the UNIX timestamp of current or specified time.

Example of an SQL function usage fragment

> SELECT unix_timestamp()

Result value

 1476884637

Example of an SQL function usage fragment

> SELECT unix_timestamp('2016-04-08', 'yyyy-MM-dd')

Result value

 1460041200

upper

upper(str)

Returns str with all characters that are changed to uppercase.

Example of an SQL function usage fragment

> SELECT upper('SparkSql')

Result value

 SPARKSQL

uuid

uuid()

Returns a universally unique identifier (UUID) string. The value is returned as a canonical UUID 36-character string.

Example of an SQL function usage fragment

> SELECT uuid()

Result value

 46707d92-02f4-4817-8116-

var_pop

var_pop(expr)
Returns the population variance that is calculated from values of a group.

var_samp

var_samp(expr)
Returns the sample variance that is calculated from values of a group.

variance

variance(expr)
Returns the sample variance that is calculated from values of a group.

weekofyear

weekofyear(date)

Returns the week of the year of the indicated date. A week is considered to start on a Monday and week 1 is the first week with >3 days.

Example of an SQL function usage fragment

> SELECT weekofyear('2008-02-20')

Result value

 8

when

CASE WHEN expr1 THEN expr2 [WHEN expr3 THEN expr4]* [ELSE expr5] END - When expr1 = true, returns expr2; else when expr3 = true, returns expr4; else returns expr5.

Arguments

  • expr1, expr3 - the branch condition expressions must all be Boolean type.
  • expr2, expr4, expr5 - the branch value expressions and else value expression must all be same type or coercible to a common type.

Example of an SQL function usage fragment

> SELECT CASE WHEN 1 > 0 THEN 1 WHEN 2 > 0 THEN 2.0 ELSE 1.2 END;

Result value

 1

Example of an SQL function usage fragment

> SELECT CASE WHEN 1 < 0 THEN 1 WHEN 2 > 0 THEN 2.0 ELSE 1.2 END;

Result value

 2

Example of an SQL function usage fragment

> SELECT CASE WHEN 1 < 0 THEN 1 WHEN 2 < 0 THEN 2.0 END;

Result value

NULL

window

xpath

xpath(xml, xpath)

Returns a string array of values within the nodes of xml that match the XPath expression.

Example of an SQL function usage fragment

> SELECT xpath('<a><b>b1</b><b>b2</b><b>b3</b><c>c1</c><c>c2</c></a>','a/b/text()')

Result value

 ['b1','b2','b3']

xpath_boolean

xpath_boolean(xml, xpath)

Returns true if the XPath expression evaluates to true, or if a matching node is found.

Example of an SQL function usage fragment

> SELECT xpath_boolean('<a><b>1</b></a>','a/b')

Result value

 true

xpath_double

xpath_double(xml, xpath)

Returns a double value, the value zero if no match is found, or NaN if a match is found but the value is nonnumeric.

Example of an SQL function usage fragment

> SELECT xpath_double('<a><b>1</b><b>2</b></a>', 'sum(a/b)')

Result value

 3.0

xpath_float

xpath_float(xml, xpath)

Returns a float value, the value zero if no match is found, or NaN if a match is found but the value is nonnumeric.

Example of an SQL function usage fragment

> SELECT xpath_float('<a><b>1</b><b>2</b></a>', 'sum(a/b)')

Result value

 3.0

xpath_int

xpath_int(xml, xpath)

Returns an integer value, or the value zero if no match is found, or a match is found but the value is nonnumeric.

Example of an SQL function usage fragment

> SELECT xpath_int('<a><b>1</b><b>2</b></a>', 'sum(a/b)')

Result value

 3

xpath_long

xpath_long(xml, xpath)

Returns a long integer value, or the value zero if no match is found, or a match is found but the value is nonnumeric.

Example of an SQL function usage fragment

> SELECT xpath_long('<a><b>1</b><b>2</b></a>', 'sum(a/b)')

Result value

 3

xpath_number

xpath_number(xml, xpath)

Returns a double value, the value zero if no match is found, or NaN if a match is found but the value is nonnumeric.

Example of an SQL function usage fragment

> SELECT xpath_number('<a><b>1</b><b>2</b></a>', 'sum(a/b)')

Result value

 3.0

xpath_short

xpath_short(xml, xpath)

Returns a short integer value, or the value zero if no match is found, or a match is found but the value is nonnumeric.

Example of an SQL function usage fragment

> SELECT xpath_short('<a><b>1</b><b>2</b></a>', 'sum(a/b)')

Result value

 3

xpath_string

xpath_string(xml, xpath)

Returns the text contents of the first xml node that matches the XPath expression.

Example of an SQL function usage fragment

> SELECT xpath_string('<a><b>b</b><c>cc</c></a>','a/c')

Result value

 cc

year

year(date)

Returns the year component of the date/timestamp.

Example of an SQL function usage fragment

> SELECT year('2016-07-30')

Result value

 2016

|

expr1 | expr2

Returns the result of bitwise OR of expr1 and expr2.

Example of an SQL function usage fragment

> SELECT 3 | 5;

Result value

 7
## ~

expr

Returns the result of bitwise NOT of expr.

Example of an SQL function usage fragment

> SELECT ~ 0;

Result value

 -1