Using regular expressions in Oracle SQL queries, pt. I

This is an article on Oracle and using of regular expressions. Oracle has several regexp functions. Each will be described with an example below.

First things first

If you have access to existing Oracle database you can skip to the next paragraph.
If you don’t have Oracle database installed already, please follow these steps:

  • register at oracle.com
  • download and install latest possible Oracle Database*
* at the moment you can get 10g Express Edition [Universal] here:
http://www.oracle.com/technetwork/database/express-edition/downloads/102xewinsoft-090667.html [^]

Using regular expressions in Oracle

There are five regexp methods implemented in Oracle:

  • REGEXP_LIKE
  • REGEXP_SUBSTR
  • REGEXP_REPLACE
  • REGEXP_INSTR
  • REGEXP_COUNT

Basically they work in exactly same way as other Oracle methods.

REGEXP functions parameters:

source - source string.

regexp - regular expression.

position - position of the character in the source string at which the match should start.

occurrence - specifies witch match to get from source string.

replacement - in REGEXP_REPLACE specifies what is the replacement for matched regexp in source string. If omitted, matching regexp value will be deleted.

return_option - in REGEXP_INSTR: set to 0 to get position of first character in match, set to 1 to get position of first character after the match.

mode - is a string of up to three (from four: i, c, n, m) characters.

  • i: turns off case sensitive matching.
  • c: turns on case sensitive matching.
  • n: '.' match any character, including newlines. DEFAULT: any character except newlines.
  • m: '^' and '$' match at the start and end of each line. DEFAULT: the match at the very start and the very end of the string.

REGEXP_LIKE(source, regexp, modes*)

SELECT 'true' FROM dual 
WHERE REGEXP_LIKE('Hello world', 'l{1,2}');
Returns true if source string contains substring 'l' or 'll'.

REGEXP_SUBSTR(source, regexp, position*, occurrence*, modes*)

SELECT REGEXP_SUBSTR('Hello World!', 'W([a-z]+)') FROM dual;
Returns a substring matching 'W' plus additional leters (more then one). Which is 'World'

REGEXP_REPLACE(source, regexp, replacement*, position*, occurrence*, modes*)

SELECT REGEXP_REPLACE('Hello World!', 'W([a-z]+)') FROM dual;
Returns 'Hello !' because we deleted 'World'

REGEXP_INSTR(source, regexp, position*, occurrence*, return_option*, modes*)

SELECT REGEXP_INSTR('Hello World!', '([A-Z])', 1, 2, 0, 'c') FROM dual;
Returns position of 2nd capitalic letter. Which is 7.

REGEXP_COUNT(source, regexp, position*, mode*) **

* optional parameter
** not available in Oracle 10g Express Edition

Thanks for all your comments.

Comments

Post new comment

  • Lines and paragraphs are automatically recognized. The <br /> line break, <p> paragraph and </p> close paragraph tags are inserted automatically. If paragraphs are not recognized simply add a couple blank lines.

More information about formatting options

CAPTCHA
This question is for testing whether you are a human visitor and to prevent automated spam submissions.
2 + 3 =
This simple math problem is designed for BOT - you just have to enter '7'.