Java Regex

What is a Regular Expression (Regex)?

A regular expression (regex) is a pattern that specifies a set of strings. In Java, regular expressions are used for pattern matching within strings. They provide a powerful and flexible way to perform tasks like searching, extracting, and replacing text.

Java supports regular expressions through the java.util.regex package, which includes the following important classes:

Pattern: A compiled representation of a regular expression.

Matcher: Used to perform the matching operations on an input string.
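
As a rough illustration of how the two classes cooperate, the sketch below compiles one pattern and uses it to search, to match the whole input, and to replace. The class name PatternMatcherSketch and its helper methods are hypothetical names chosen for this example; only Pattern.compile(), matcher(), find(), matches(), and replaceAll() come from java.util.regex.

```java
import java.util.regex.Pattern;

class PatternMatcherSketch {
    // Compile once and reuse: Pattern instances are immutable and thread-safe.
    private static final Pattern WORD = Pattern.compile("world");

    // Searching: true if the pattern occurs anywhere in the input.
    static boolean containsWord(String input) {
        return WORD.matcher(input).find();
    }

    // Whole-input matching: true only if the entire input matches the pattern.
    static boolean isExactlyWord(String input) {
        return WORD.matcher(input).matches();
    }

    // Replacing: substitute every occurrence of the pattern.
    static String replaceWord(String input, String replacement) {
        return WORD.matcher(input).replaceAll(replacement);
    }

    public static void main(String[] args) {
        System.out.println(containsWord("hello world"));         // true
        System.out.println(isExactlyWord("hello world"));        // false
        System.out.println(isExactlyWord("world"));              // true
        System.out.println(replaceWord("hello world", "Java"));  // hello Java
    }
}
```

The difference between find() and matches() is worth noting: find() looks for the pattern anywhere in the input, while matches() succeeds only when the pattern covers the input from start to end.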

Basic Syntax of Java Regular Expressions

A regex pattern in Java can consist of literals, metacharacters, and quantifiers that define the string matching behavior. Below is a breakdown of the basic syntax and examples:

1. Literals: A literal character matches itself.

Example:


//LiteralExample.java file
import java.util.regex.*;

public class LiteralExample {
    public static void main(String[] args) {
        String input = "hello world!";
        
        // Regex: Match the literal "hello"
        String regex = "hello";
        
        // Create a Pattern object
        Pattern pattern = Pattern.compile(regex);
        
        // Create a Matcher object
        Matcher matcher = pattern.matcher(input);
        
        // Find and print matches
        if (matcher.find()) {
            System.out.println("Found literal: " + matcher.group());  // Output: "hello"
        } else {
            System.out.println("No match found.");
        }
    }
}

Output:

Found literal: hello

Explanation: The regex “hello” matches the literal string “hello” in the input “hello world!”.

2. Metacharacters: Special characters that have a specific meaning in regex. These are:

. (dot) — Matches any single character except a line terminator such as newline (unless the Pattern.DOTALL flag is set).

Example: a.c will match abc, axc, etc., but not ac.


//DotExample.java file
import java.util.regex.*;

public class DotExample {
    public static void main(String[] args) {
        String input = "abc acd axd";
        
        // Regex: Match "a", then any single character, then "c"
        String regex = "a.c";
        
        // Create a Pattern object
        Pattern pattern = Pattern.compile(regex);
        
        // Create a Matcher object
        Matcher matcher = pattern.matcher(input);
        
        // Find and print matches
        while (matcher.find()) {
            System.out.println("Found match: " + matcher.group());  // Output: "abc"
        }
    }
}

Output:

Found match: abc

Explanation: The . matches exactly one character, so “a.c” matches only abc in this input. In acd and axd the third character is d rather than c, so neither one matches.
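
Because the dot is a metacharacter, matching a literal period requires escaping it. The following is a minimal sketch of the difference; the class DotEscapeSketch and the helper findAll are hypothetical names used only for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class DotEscapeSketch {
    // Collects every non-overlapping match of regex in input, left to right.
    static List<String> findAll(String regex, String input) {
        List<String> matches = new ArrayList<>();
        Matcher matcher = Pattern.compile(regex).matcher(input);
        while (matcher.find()) {
            matches.add(matcher.group());
        }
        return matches;
    }

    public static void main(String[] args) {
        String input = "a.c abc a-c";

        // Unescaped dot: any single character may sit between "a" and "c".
        System.out.println(findAll("a.c", input));    // [a.c, abc, a-c]

        // Escaped dot: only a literal "." may sit between "a" and "c".
        // The regex is a\.c; the backslash is doubled inside a Java string literal.
        System.out.println(findAll("a\\.c", input));  // [a.c]
    }
}
```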

^ (caret) — Anchors the match to the beginning of a string.

Example:


//CaretExample.java file
import java.util.regex.*;

public class CaretExample {
    public static void main(String[] args) {
        String input = "hello world";
        
        // Regex: Match "hello" at the start of the string
        String regex = "^hello";
        
        // Create a Pattern object
        Pattern pattern = Pattern.compile(regex);
        
        // Create a Matcher object
        Matcher matcher = pattern.matcher(input);
        
        // Find and print matches
        if (matcher.find()) {
            System.out.println("Found match: " + matcher.group());  // Output: "hello"
        }
    }
}

Output:

Found match: hello

Explanation: The ^ ensures that “hello” must appear at the start of the string.

$ (dollar) — Anchors the match to the end of a string.

Example: abc$ will match “abc” only if it is at the end of the string.


//DollarExample.java file
import java.util.regex.*;

public class DollarExample {
    public static void main(String[] args) {
        String input = "hello world";
        
        // Regex: Match "world" at the end of the string
        String regex = "world$";
        
        // Create a Pattern object
        Pattern pattern = Pattern.compile(regex);
        
        // Create a Matcher object
        Matcher matcher = pattern.matcher(input);
        
        // Find and print matches
        if (matcher.find()) {
            System.out.println("Found match: " + matcher.group());  // Output: "world"
        }
    }
}

Output:

Found match: world

Explanation: The $ ensures that “world” must appear at the end of the string.
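
The two anchors can also be combined so that find() accepts an input only when the pattern spans all of it. The sketch below is one possible way to show this; AnchorSketch and its method names are hypothetical.

```java
import java.util.regex.Pattern;

class AnchorSketch {
    private static final Pattern STARTS_WITH_HELLO = Pattern.compile("^hello");
    private static final Pattern ENDS_WITH_WORLD = Pattern.compile("world$");
    private static final Pattern EXACTLY_HELLO_WORLD = Pattern.compile("^hello world$");

    // True if the input begins with "hello".
    static boolean startsWithHello(String input) {
        return STARTS_WITH_HELLO.matcher(input).find();
    }

    // True if the input ends with "world".
    static boolean endsWithWorld(String input) {
        return ENDS_WITH_WORLD.matcher(input).find();
    }

    // True only if the input is "hello world" with nothing before or after it.
    static boolean isExactlyHelloWorld(String input) {
        return EXACTLY_HELLO_WORLD.matcher(input).find();
    }

    public static void main(String[] args) {
        String[] samples = {"hello world", "hello world!", "oh hello world"};
        for (String sample : samples) {
            System.out.println(sample
                    + " -> starts: " + startsWithHello(sample)
                    + ", ends: " + endsWithWorld(sample)
                    + ", exact: " + isExactlyHelloWorld(sample));
        }
    }
}
```

Only the first sample passes all three checks; the second fails the end anchor because of the trailing "!", and the third fails the start anchor because of the leading "oh ".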

* (asterisk) — Matches zero or more occurrences of the preceding element.

Example: a*b matches b, ab, aab, aaab, etc.

+ (plus) — Matches one or more occurrences of the preceding element.

Example: a+b matches ab, aab, aaab, but not b.

? (question mark) — Matches zero or one occurrence of the preceding element.

Example: a?b matches b and ab.

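{} (curly braces) — Specifies how many times the preceding element must occur: {n} means exactly n times, {n,} means at least n times, and {n,m} means between n and m times.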

Example: a{2} matches exactly two consecutive a characters, i.e., aa.

[] (square brackets) — Matches any one of the characters inside the brackets.

Example: [abc] matches either a, b, or c.
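
The quantifier and bracket examples above can be checked with a short program. This is a minimal sketch that tests whole strings with Pattern.matches(); the class QuantifierSketch and the helper fullMatch are hypothetical names introduced for the example.

```java
import java.util.regex.Pattern;

class QuantifierSketch {
    // True only if the entire input matches the regex.
    static boolean fullMatch(String regex, String input) {
        return Pattern.matches(regex, input);
    }

    public static void main(String[] args) {
        // * : zero or more "a" before "b"
        System.out.println(fullMatch("a*b", "b"));      // true
        System.out.println(fullMatch("a*b", "aaab"));   // true

        // + : one or more "a" before "b"
        System.out.println(fullMatch("a+b", "b"));      // false
        System.out.println(fullMatch("a+b", "aab"));    // true

        // ? : zero or one "a" before "b"
        System.out.println(fullMatch("a?b", "ab"));     // true
        System.out.println(fullMatch("a?b", "aab"));    // false

        // {2} : exactly two "a" characters
        System.out.println(fullMatch("a{2}", "aa"));    // true
        System.out.println(fullMatch("a{2}", "aaa"));   // false

        // [abc] : exactly one of the listed characters
        System.out.println(fullMatch("[abc]", "b"));    // true
        System.out.println(fullMatch("[abc]", "d"));    // false
    }
}
```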

3. Character Classes:

\d: Matches any digit (0-9).

Example: \d – Matches a digit


//DigitExample.java file
import java.util.regex.*;

public class DigitExample {
    public static void main(String[] args) {
        String input = "There are 123 apples";
        
        // Regex: Match digits
        String regex = "\\d";
        
        // Create a Pattern object
        Pattern pattern = Pattern.compile(regex);
        
        // Create a Matcher object
        Matcher matcher = pattern.matcher(input);
        
        // Find and print matches
        while (matcher.find()) {
            System.out.println("Found digit: " + matcher.group());  // Output: "1", "2", "3"
        }
    }
}

Output:

Found digit: 1
Found digit: 2
Found digit: 3

Explanation: The \d matches any digit from 0-9.
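
A single \d matches one digit at a time, which is why 123 is reported as three separate matches. Combining it with the + quantifier captures each run of digits as one match. The sketch below assumes the numbers are small enough to fit in an int; NumberExtractSketch and extractNumbers are hypothetical names.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class NumberExtractSketch {
    // One or more consecutive digits form a single match.
    private static final Pattern NUMBER = Pattern.compile("\\d+");

    // Returns every run of digits in the input as an integer.
    // Assumption: each run fits in an int; longer runs would make
    // Integer.parseInt throw NumberFormatException.
    static List<Integer> extractNumbers(String input) {
        List<Integer> numbers = new ArrayList<>();
        Matcher matcher = NUMBER.matcher(input);
        while (matcher.find()) {
            numbers.add(Integer.parseInt(matcher.group()));
        }
        return numbers;
    }

    public static void main(String[] args) {
        System.out.println(extractNumbers("There are 123 apples and 45 pears"));  // [123, 45]
    }
}
```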

\D: Matches any non-digit character.

\w: Matches any word character (letters, digits, or underscore).

Example: \w – Matches a word character (letters, digits, underscores)


//WordCharacterExample.java file
import java.util.regex.*;

public class WordCharacterExample {
    public static void main(String[] args) {
        String input = "user_123";
        
        // Regex: Match word characters
        String regex = "\\w";
        
        // Create a Pattern object
        Pattern pattern = Pattern.compile(regex);
        
        // Create a Matcher object
        Matcher matcher = pattern.matcher(input);
        
        // Find and print matches
        while (matcher.find()) {
            System.out.println("Found word character: " + matcher.group());  // Output: "u", "s", "e", "_", "1", "2", "3"
        }
    }
}

Output:

Found word character: u
Found word character: s
Found word character: e
Found word character: _
Found word character: 1
Found word character: 2
Found word character: 3

Explanation: The \w matches any letter, digit, or underscore.

\W: Matches any non-word character.

\s: Matches any whitespace character (space, tab, newline).

Example: \s – Matches whitespace characters


//WhitespaceExample.java file
import java.util.regex.*;

public class WhitespaceExample {
    public static void main(String[] args) {
        String input = "Hello world! How are you?";
        
        // Regex: Match whitespace characters
        String regex = "\\s";
        
        // Create a Pattern object
        Pattern pattern = Pattern.compile(regex);
        
        // Create a Matcher object
        Matcher matcher = pattern.matcher(input);
        
        // Find and print matches
        while (matcher.find()) {
            System.out.println("Found whitespace: " + matcher.group());  // Output: " " (space)
        }
    }
}

Output:

Found whitespace:
Found whitespace:
Found whitespace:
Found whitespace:

Explanation: The \s matches any whitespace character like space, tab, or newline.

\S: Matches any non-whitespace character.
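
The negated classes \D, \W, and \S have no example above, so here is a small sketch of how they might be used for cleanup and tokenizing. The class NegatedClassSketch and its method names are hypothetical; String.replaceAll() and Matcher.find() are standard Java.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class NegatedClassSketch {
    // \D matches every non-digit, so removing those leaves only digits.
    static String digitsOnly(String input) {
        return input.replaceAll("\\D", "");
    }

    // \W matches every non-word character, so removing those leaves
    // only letters, digits, and underscores.
    static String wordCharsOnly(String input) {
        return input.replaceAll("\\W", "");
    }

    // \S+ matches each run of non-whitespace characters, i.e. each token.
    static List<String> tokens(String input) {
        List<String> result = new ArrayList<>();
        Matcher matcher = Pattern.compile("\\S+").matcher(input);
        while (matcher.find()) {
            result.add(matcher.group());
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(digitsOnly("(555) 123-4567"));    // 5551234567
        System.out.println(wordCharsOnly("user_123!@#"));    // user_123
        System.out.println(tokens("Hello  world!\tBye"));    // [Hello, world!, Bye]
    }
}
```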

4. Groups and Alternation

() (parentheses) — Groups patterns.

Example: () (Group) – Groups multiple characters


//GroupExample.java file
import java.util.regex.*;

public class GroupExample {
    public static void main(String[] args) {
        String input = "cat bat mat";
        
        // Regex: Group the alternatives "cat", "bat", and "mat"
        String regex = "(cat|bat|mat)";
        
        // Create a Pattern object
        Pattern pattern = Pattern.compile(regex);
        
        // Create a Matcher object
        Matcher matcher = pattern.matcher(input);
        
        // Find and print matches
        while (matcher.find()) {
            System.out.println("Found group match: " + matcher.group());  // Output: "cat", "bat", "mat"
        }
    }
}

Output:

Found group match: cat
Found group match: bat
Found group match: mat

Explanation: The (cat|bat|mat) group matches any of the three words cat, bat, or mat.
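
Parentheses also capture the text they match, and the captured part can be read back with matcher.group(1). The sketch below narrows the group to just the first letter to make the difference visible; CaptureGroupSketch and its method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class CaptureGroupSketch {
    // Group 1 captures only the first letter; "at" lies outside the group.
    private static final Pattern WORD = Pattern.compile("(c|b|m)at");

    // group() (same as group(0)) returns the whole match.
    static List<String> wholeMatches(String input) {
        List<String> result = new ArrayList<>();
        Matcher matcher = WORD.matcher(input);
        while (matcher.find()) {
            result.add(matcher.group());
        }
        return result;
    }

    // group(1) returns only what the first pair of parentheses captured.
    static List<String> firstLetters(String input) {
        List<String> result = new ArrayList<>();
        Matcher matcher = WORD.matcher(input);
        while (matcher.find()) {
            result.add(matcher.group(1));
        }
        return result;
    }

    public static void main(String[] args) {
        String input = "cat bat mat hat";
        System.out.println(wholeMatches(input));  // [cat, bat, mat]
        System.out.println(firstLetters(input));  // [c, b, m]
    }
}
```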

| (pipe) — Alternation, matches either the pattern before or after the pipe.

Example: | (Alternation) – Matches either one pattern or another


//AlternationExample.java file
import java.util.regex.*;

public class AlternationExample {
    public static void main(String[] args) {
        String input = "John Mark Tom";
        
        // Regex: Match either "John" or "Tom"
        String regex = "John|Tom";
        
        // Create a Pattern object
        Pattern pattern = Pattern.compile(regex);
        
        // Create a Matcher object
        Matcher matcher = pattern.matcher(input);
        
        // Find and print matches
        while (matcher.find()) {
            System.out.println("Found alternation match: " + matcher.group());  // Output: "John", "Tom"
        }
    }
}

Output:

Found alternation match: John
Found alternation match: Tom

Explanation: The John|Tom alternation matches either John or Tom in the string.
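
One detail that often surprises people is that alternation applies to everything on either side of the pipe, including anchors. Grouping limits its reach. The following sketch contrasts the two forms; AlternationScopeSketch and its method names are hypothetical.

```java
import java.util.regex.Pattern;

class AlternationScopeSketch {
    // Without a group this reads as: (^John) OR (Tom$).
    private static final Pattern LOOSE = Pattern.compile("^John|Tom$");

    // With a group this reads as: the whole input is "John" or "Tom".
    private static final Pattern STRICT = Pattern.compile("^(John|Tom)$");

    static boolean looseMatch(String input) {
        return LOOSE.matcher(input).find();
    }

    static boolean strictMatch(String input) {
        return STRICT.matcher(input).find();
    }

    public static void main(String[] args) {
        System.out.println(looseMatch("John Smith"));   // true: starts with "John"
        System.out.println(strictMatch("John Smith"));  // false: extra text after "John"
        System.out.println(strictMatch("Tom"));         // true
    }
}
```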

Java Regex – Interview Questions

Q 1: What is regex in Java?

Ans: A regex (regular expression) is a pattern that describes a set of strings; Java uses it to search, extract, and replace text.

Q 2: Which package supports regex?

Ans: java.util.regex

Q 3: What is Pattern class?

Ans: Pattern is the compiled representation of a regular expression, created with Pattern.compile().

Q 4: What is Matcher class?

Ans: Matcher performs the matching operations of a compiled Pattern on an input string, using methods such as find() and group().

Q 5: Where is regex commonly used?

Ans: Input validation and text processing.

Java Regex – Objective Questions (MCQs)

Q1. Which package in Java contains classes for Regular Expressions?






Q2. Which class is used to define a pattern in Java Regex?






Q3. Which method of the Matcher class is used to check if the entire input sequence matches the pattern?






Q4. What does the regex symbol \d represent in Java?






Q5. What will the following code return?

Pattern.matches("[a-z]+", "Java");





