The Contains Duplicate problem is a typical algorithmic challenge that involves determining whether a given array of integers contains any duplicate elements. This problem is fundamental in understanding array manipulation and hashing techniques.

In this post, we will explore several ways to solve this problem in Rust, starting with a brute-force approach, followed by a sorting-based solution, and concluding with the most efficient method using a HashSet.

Problem Definition

You are given an array of integers nums. The task is to determine if any value appears at least twice in the array. Return true if any value appears more than once, and false otherwise.

Example:

Input: nums = [1, 2, 3, 1]
Output: true

Input: nums = [1, 2, 3, 4]
Output: false

Input: nums = [1, 1, 1, 3, 3, 4, 3, 2, 4, 2]
Output: true

Constraints:

  • The length of the array can be up to (10^5) elements.
  • Each element can range from (-10^9) to (10^9).

Let's explore three different ways to solve this problem using Rust.

Solution 1: Brute-Force Approach

Approach:

In the brute-force approach, we compare every element with every other element in the array to see if any two elements are the same. If two elements are found to be equal, we return true. If no duplicates are found, we return false.

Code Implementation:

fn contains_duplicate(nums: Vec<i32>) -> bool {
    for i in 0..nums.len() {
        for j in i + 1..nums.len() {
            if nums[i] == nums[j] {
                return true;  // Duplicate found
            }
        }
    }
    false  // No duplicates found
}

fn main() {
    let nums = vec![1, 2, 3, 1];
    println!("{}", contains_duplicate(nums));  // Output: true
}

Time and Space Complexity:

  • Time Complexity: O(n²) due to two nested loops.
  • Space Complexity: O(1) since no extra space is used other than a few variables.

Explanation:

This brute-force solution uses two nested loops. The outer loop selects an element, and the inner loop compares it with every other element that follows. If two elements are the same, the function returns true. If no duplicates are found after all comparisons, the function returns false.

Drawbacks:

The brute-force approach is inefficient for large arrays because it has a time complexity of O(n²). This makes it impractical for arrays with thousands of elements or more.

Solution 2: Sorting Approach

Approach:

A more efficient way to solve this problem is by first sorting the array. After sorting, any duplicates will appear next to each other. Then, we simply check if any two consecutive elements are the same.

Code Implementation:

fn contains_duplicate(mut nums: Vec<i32>) -> bool {
    nums.sort();  // Sort the array
    for i in 1..nums.len() {
        if nums[i] == nums[i - 1] {  // Compare adjacent elements
            return true;
        }
    }
    false  // No duplicates found
}

fn main() {
    let nums = vec![1, 2, 3, 1];
    println!("{}", contains_duplicate(nums));  // Output: true
}

Time and Space Complexity:

  • Time Complexity: O(n log n) due to sorting.
  • Space Complexity: O(1) assuming sorting is done in place.

Explanation:

This approach works by first sorting the array using Rust’s sort() method. After the array is sorted, we iterate through the array and compare adjacent elements. If any two consecutive elements are the same, we return true. If no duplicates are found, we return false.

Benefits and Drawbacks:

  • Benefits: More efficient than the brute-force approach, as sorting reduces the number of comparisons.
  • Drawbacks: Sorting modifies the original array, and the time complexity is still O(n log n), which may not be fast enough for very large arrays.

Solution 3: Using a HashSet (Optimal Solution)

Approach:

The most efficient way to solve this problem is by using a HashSet. A HashSet allows for O(1) average-time complexity for insertions and lookups. As we iterate through the array, we can check if an element already exists in the set. If it does, we return true. If not, we add the element to the set and continue.

Code Implementation:

use std::collections::HashSet;

fn contains_duplicate(nums: Vec<i32>) -> bool {
    let mut seen = HashSet::new();  // Create an empty HashSet
    for num in nums {
        if seen.contains(&num) {  // Check if the element is already in the set
            return true;  // Duplicate found
        }
        seen.insert(num);  // Add the element to the set
    }
    false  // No duplicates found
}

fn main() {
    let nums = vec![1, 2, 3, 1];
    println!("{}", contains_duplicate(nums));  // Output: true
}

Time and Space Complexity:

  • Time Complexity: O(n) because we iterate through the array once, and set operations (insertion and lookup) are O(1) on average.
  • Space Complexity: O(n) due to the extra space used by the HashSet.

Explanation:

This approach uses a HashSet to store the elements of the array as we iterate through it. If an element already exists in the set, we immediately return true. Otherwise, the element is added to the set. If no duplicates are found by the end of the loop, we return false.

Benefits and Drawbacks:

  • Benefits: This is the most efficient solution with linear time complexity and is easy to implement.
  • Drawbacks: The extra memory required to store the set makes the space complexity O(n). However, this trade-off is usually acceptable given the performance benefits.

Edge Cases

When solving the Contains Duplicate problem in Rust, it's essential to account for various edge cases to ensure that your solution works in all scenarios. Below are some important edge cases to consider and test when implementing your solution:

1. Empty Array:

If the input array is empty, there are no elements to compare, so the function should return false.

Example:

let nums: Vec<i32> = vec![];
println!("{}", contains_duplicate(nums));  // Output: false
  • Expected Output: Since there are no elements, there cannot be any duplicates, so the function should return false.

2. Array with One Element:

If the array contains only one element, it’s impossible to have duplicates, so the function should return false.

Example:

let nums = vec![1];
println!("{}", contains_duplicate(nums));  // Output: false
  • Expected Output: Since there is only one element, it cannot have duplicates, so the function should return false.

3. Array with All Unique Elements:

If the input array contains all unique elements (i.e., no duplicates), the function should return false.

Example:

let nums = vec![1, 2, 3, 4, 5];
println!("{}", contains_duplicate(nums));  // Output: false
  • Expected Output: All elements in the array are unique, so the function should return false.

4. Array with All Duplicate Elements:

If the input array contains the same element repeated multiple times, the function should return true.

Example:

let nums = vec![1, 1, 1, 1];
println!("{}", contains_duplicate(nums));  // Output: true
  • Expected Output: Since every element is the same, the function should return true.

5. Array with Negative Numbers:

The array may contain negative numbers. The solution should handle these correctly and identify duplicates among them.

Example:

let nums = vec![-1, -2, -3, -1];
println!("{}", contains_duplicate(nums));  // Output: true
  • Expected Output: The element -1 appears twice, so the function should return true.

6. Array with Negative and Positive Numbers:

The array may contain both negative and positive numbers. Ensure the solution handles and distinguishes between these correctly.

Example:

let nums = vec![-1, 2, -3, 3];
println!("{}", contains_duplicate(nums));  // Output: false
  • Expected Output: Since all elements are unique, the function should return false.

7. Array with Zeros:

The array may contain zeros, and the solution should handle multiple zeros correctly.

Example:

let nums = vec![0, 1, 2, 0];
println!("{}", contains_duplicate(nums));  // Output: true
  • Expected Output: The element 0 appears twice, so the function should return true.

8. Array with Mixed Duplicates and Unique Elements:

The array may contain a mix of unique and duplicate elements. The solution should return true as soon as a duplicate is found.

Example:

let nums = vec![1, 2, 3, 2];
println!("{}", contains_duplicate(nums));  // Output: true
  • Expected Output: The element 2 appears twice, so the function should return true.

9. Array with Large Numbers:

The array may contain very large or very small integers within the range [-10^9, 10^9]. Ensure that the solution can handle these extreme values correctly.

Example:

let nums = vec![-1000000000, 1000000000];
println!("{}", contains_duplicate(nums));  // Output: false
  • Expected Output: Since the two elements are different and within the valid range, the function should return false.

10. Array with Maximum Length (100,000 elements):

If the array contains the maximum number of elements (100,000), the solution should handle the large dataset efficiently. This is especially important for the brute-force solution, which may fail or run too slowly for such input sizes.

Conclusion

In this post, we explored three different ways to solve the Contains Duplicate problem in Rust:

  1. Brute-force approach: Simple but inefficient with O(n²) time complexity.
  2. Sorting approach: More efficient with O(n log n) time complexity, but modifies the original array.
  3. HashSet approach: The optimal solution with O(n) time complexity and O(n) space complexity, making it the best choice for large datasets.

In practice, the HashSet solution is often the most efficient and straightforward approach, especially when working with large arrays.