The Valid Anagram problem is a common coding challenge that requires determining whether two strings are anagrams of each other. Two strings are considered anagrams if they contain the same characters with the same frequencies, but may appear in different orders. For example, "listen" and "silent" are anagrams.

In this post, we’ll explore different approaches to solving the Valid Anagram problem in Java. We’ll go over various methods, analyze time and space complexity, and cover edge cases to ensure the solution is robust.

Problem Definition

Given two strings s and t, write a function isAnagram that determines whether t is an anagram of s.

Example:

Input: s = "anagram", t = "nagaram"
Output: true

Input: s = "rat", t = "car"
Output: false

Constraints:

  • Both s and t consist of lowercase English letters.
  • The length of both strings can be up to (10^5).

Approach 1: Sorting-Based Solution

Explanation:

A straightforward way to determine if two strings are anagrams is by sorting both strings. If the sorted versions of the strings are identical, then they must be anagrams.

Code Implementation:

import java.util.Arrays;

public class ValidAnagram {
    public static boolean isAnagram(String s, String t) {
        // If the lengths of the strings are different, they can't be anagrams
        if (s.length() != t.length()) {
            return false;
        }
        
        // Sort both strings and compare
        char[] sArr = s.toCharArray();
        char[] tArr = t.toCharArray();
        Arrays.sort(sArr);
        Arrays.sort(tArr);
        
        return Arrays.equals(sArr, tArr);
    }

    public static void main(String[] args) {
        System.out.println(isAnagram("anagram", "nagaram"));  // Output: true
        System.out.println(isAnagram("rat", "car"));  // Output: false
    }
}

Time and Space Complexity:

  • Time Complexity: O(n log n), where n is the length of the string due to the sorting operation.
  • Space Complexity: O(n), as sorting creates new arrays for the sorted versions of the strings.

Explanation:

In this approach, both strings are sorted and compared. If the sorted versions are identical, the strings are anagrams. Sorting takes O(n log n) time, making this solution less efficient for very large inputs.

Approach 2: Frequency Count Using Hash Maps (Optimal Solution)

Explanation:

A more efficient approach is to count the frequency of each character in both strings and compare these frequencies. We can use an integer array of size 26 (since the input consists of lowercase English letters) to count the occurrences of characters. If both strings have the same character frequencies, they are anagrams.

Code Implementation:

public class ValidAnagram {
    public static boolean isAnagram(String s, String t) {
        // If the lengths are different, they can't be anagrams
        if (s.length() != t.length()) {
            return false;
        }

        // Create an array to count the frequency of each character in both strings
        int[] charCount = new int[26];

        // Count characters in both strings
        for (int i = 0; i < s.length(); i++) {
            charCount[s.charAt(i) - 'a']++;  // Increment count for s
            charCount[t.charAt(i) - 'a']--;  // Decrement count for t
        }

        // Check if all counts are zero
        for (int count : charCount) {
            if (count != 0) {
                return false;
            }
        }

        return true;
    }

    public static void main(String[] args) {
        System.out.println(isAnagram("anagram", "nagaram"));  // Output: true
        System.out.println(isAnagram("rat", "car"));  // Output: false
    }
}

Time and Space Complexity:

  • Time Complexity: O(n), where n is the length of the strings because we iterate over each string once.
  • Space Complexity: O(1), since we use a fixed-size array (26 elements) to store character frequencies.

Explanation:

This approach counts the frequency of characters in both strings using a single array of size 26. The time complexity is O(n), making this approach optimal for large inputs.

Edge Cases

When solving the Valid Anagram problem, you should handle a number of edge cases:

  1. Different Length Strings:
    If the two strings have different lengths, they cannot be anagrams. The solution should return false immediately for this case.

    String s = "abcd";
    String t = "abc";
    System.out.println(isAnagram(s, t));  // Output: false
    
  2. Empty Strings:
    Two empty strings are considered anagrams because they contain the same (zero) characters.

    String s = "";
    String t = "";
    System.out.println(isAnagram(s, t));  // Output: true
    
  3. Single Character Strings:
    Strings with one character can only be anagrams if the characters are the same.

    String s = "a";
    String t = "a";
    System.out.println(isAnagram(s, t));  // Output: true
    
    String s = "a";
    String t = "b";
    System.out.println(isAnagram(s, t));  // Output: false
    
  4. Case Sensitivity:
    The problem assumes that the input strings consist of lowercase letters. If case sensitivity were a factor, "A" and "a" would not be considered the same. This edge case should be handled if necessary.

    String s = "a";
    String t = "A";
    System.out.println(isAnagram(s, t));  // Output: false
    
  5. Non-Alphabetic Characters:
    If the input contains non-alphabetic characters (such as numbers, punctuation, or spaces), those characters should also be included in the anagram check. You should handle this based on problem requirements.

    String s = "a.b";
    String t = "b.a";
    System.out.println(isAnagram(s, t));  // Output: true
    

Conclusion

In this post, we explored two different approaches to solve the Valid Anagram problem in Java. The sorting-based approach is easy to implement but has an O(n log n) time complexity. The frequency count approach is more efficient, achieving O(n) time complexity with O(1) space complexity by using an array to count character occurrences. By handling edge cases properly, you can ensure that your solution works for all possible inputs.