Monday, August 14, 2017

REGEXP_COUNT Tips

This function was introduced in Oracle 11g and it allows us to count the number of times a pattern or a substring occurs in the source string. This is very different from the plain COUNT function which is a column level function, whereas, REGEXP_COUNT is a cell level function which operates on each cell individually.

The prototype of the REGEXP_COUNT function is shown below,

REGEXP_COUNT(<Source_string>, <Search_pattern>[, <Start_position>[, <Match_modifiers>]])

·         Source_string: The string to be searched for.

·         Search_pattern: The regular expression pattern that is to be searched for in the source string. This can be a combination of the POSIX and the Perl-influenced metacharacters mentioned in the above section.

·         Start_position: This is an optional parameter. This determines the position in the source string where the search starts. By default, it is 1, which is the starting position of the source string.

·         Match_modifiers: This is an optional parameter. This parameter allows us to modify, the matching behavior of the function. The valid range of options is mentioned in the Pattern Matching Modifiers section explained above.
Text Count
The basic operation, which can be done using this function is to count for a character or a series of characters in the source string as shown below.

SELECT REGEXP_COUNT ('REGEXP_COUNT is a cell level function which operates on each cell individually', 'cell', 1) regexp_count
FROM dual;

Result:
2
Digit Count
The below statement, searches for the number of numeric digits present in the source string using the Perl influenced metacharacter \d.

SELECT REGEXP_COUNT ('REGEXP_COUNT is introduced in the Oracle version 11gR1', '\d', 1) regexp_count
FROM dual;

Result:
3
Vowels Count
The below shown listing counts for the total number of vowels in the source string with case sensitivity check turned ON by default (Based on the NLS_SORT parameter).

SELECT REGEXP_COUNT ('REGEXP_COUNT is a cell level function which operates on each cell individually', '[aeiou]', 1) regexp_count
FROM dual;

Result:
22

The same snippet is used again for the vowel count, but this time with the case sensitivity turned OFF by setting the match modifier to manually.

SELECT REGEXP_COUNT ('REGEXP_COUNT is a cell level function which operates on each cell individually', '[aeiou]', 1, 'i') regexp_count
FROM dual;

Result:
26
Word Count
The total number of words in the source string can be found by using the below statement. The search pattern “[^ ]+” checks for characters other than the space character “[^ ]” followed by one or more non-space characters using the “+” operator.

SELECT REGEXP_COUNT ('REGEXP_COUNT is introduced in the Oracle version 11gR1', '[^ ]+',1) regexp_count
FROM dual;

Result:
8
Character count
The total number of characters in the source string can be found by using the below statement. The search pattern “( . )” searches for all possible characters in the source string.

SELECT REGEXP_COUNT ('REGEXP_COUNT is introduced in the Oracle version 11gR1', '(.)', 1) regexp_count
FROM dual;

Result:
54
DOT Count
The total number of DOT characters in the source string can be found using the below query statement. The search pattern “\.” searches for the literal DOT in the source string. As DOT is a metacharacter, it is escaped using the escape operator backslash “\”.

SELECT REGEXP_COUNT ('There.are.seven.dots.in.this.string.', '\.',1) regexp_count
FROM dual;

Result:
7

No comments:

Post a Comment