87. Scramble String


Problem Description

The problem gives us a possible operation on a string where we can take any non-empty string (other than a single character) and split it into two non-empty substrings. We then have the option to either swap these two substrings or leave them as they are. This process can be applied recursively to the new substrings generated from the split. With these rules in mind, we are asked to determine if a given string s2 can be made from another string s1 through one or more of the mentioned scrambling operations, assuming s1 and s2 are of the same length.

Intuition

The core of the solution lies in understanding that the problem can be broken down into smaller subproblems. This is a classic example of a divide and conquer algorithm. Given two strings s1 and s2, our goal is to see if there is a way to split both strings into substrings such that, after potentially swapping the substrings, the smaller pieces are equal or can themselves be further scrambled to be equal.

Taking this intuition, we can design a recursive function that tries to split the strings into two parts in various ways and then checks if after swapping or not swapping, the resulting substrings could be made equal through further recursion. To optimize the recursive solution and eliminate redundant calculations, we use memorization, which stores the results of subproblems that have already been solved.

The function dfs(i, j, k) represents a recursive depth-first search with memorization that checks if the substring of s1 starting at index i and of length k can be transformed into the substring of s2 starting at index j and of the same length. This is done by checking, for each possible split, both cases: keeping the order of the substrings untouched, and swapping them. If any of these cases yields true, we can conclude that s2 is a scrambled version of s1.

To implement this efficiently, we use the cache decorator (from Python's functools module) to memorize the results of subproblems, thus ensuring that our algorithm runs in polynomial time. The recursive function will stop splitting once it reaches substrings of length 1, as those cannot be split further, and a simple character comparison will suffice to continue the recursion. Without memorization, the solution would be prohibitively slow due to the exponential growth of possible splits as the string length increases.

Learn more about Dynamic Programming patterns.

Not Sure What to Study? Take the 2-min Quiz to Find Your Missing Piece

What's the output of running the following function using input [30, 20, 10, 100, 33, 12]?

1def fun(arr: List[int]) -> List[int]:
2    import heapq
3    heapq.heapify(arr)
4    res = []
5    for i in range(3):
6        res.append(heapq.heappop(arr))
7    return res
8
1public static int[] fun(int[] arr) {
2    int[] res = new int[3];
3    PriorityQueue<Integer> heap = new PriorityQueue<>();
4    for (int i = 0; i < arr.length; i++) {
5        heap.add(arr[i]);
6    }
7    for (int i = 0; i < 3; i++) {
8        res[i] = heap.poll();
9    }
10    return res;
11}
12
1class HeapItem {
2    constructor(item, priority = item) {
3        this.item = item;
4        this.priority = priority;
5    }
6}
7
8class MinHeap {
9    constructor() {
10        this.heap = [];
11    }
12
13    push(node) {
14        // insert the new node at the end of the heap array
15        this.heap.push(node);
16        // find the correct position for the new node
17        this.bubble_up();
18    }
19
20    bubble_up() {
21        let index = this.heap.length - 1;
22
23        while (index > 0) {
24            const element = this.heap[index];
25            const parentIndex = Math.floor((index - 1) / 2);
26            const parent = this.heap[parentIndex];
27
28            if (parent.priority <= element.priority) break;
29            // if the parent is bigger than the child then swap the parent and child
30            this.heap[index] = parent;
31            this.heap[parentIndex] = element;
32            index = parentIndex;
33        }
34    }
35
36    pop() {
37        const min = this.heap[0];
38        this.heap[0] = this.heap[this.size() - 1];
39        this.heap.pop();
40        this.bubble_down();
41        return min;
42    }
43
44    bubble_down() {
45        let index = 0;
46        let min = index;
47        const n = this.heap.length;
48
49        while (index < n) {
50            const left = 2 * index + 1;
51            const right = left + 1;
52
53            if (left < n && this.heap[left].priority < this.heap[min].priority) {
54                min = left;
55            }
56            if (right < n && this.heap[right].priority < this.heap[min].priority) {
57                min = right;
58            }
59            if (min === index) break;
60            [this.heap[min], this.heap[index]] = [this.heap[index], this.heap[min]];
61            index = min;
62        }
63    }
64
65    peek() {
66        return this.heap[0];
67    }
68
69    size() {
70        return this.heap.length;
71    }
72}
73
74function fun(arr) {
75    const heap = new MinHeap();
76    for (const x of arr) {
77        heap.push(new HeapItem(x));
78    }
79    const res = [];
80    for (let i = 0; i < 3; i++) {
81        res.push(heap.pop().item);
82    }
83    return res;
84}
85

Solution Approach

The solution to this problem utilizes a technique known as memorized recursion, which is a subset of dynamic programming. The idea is that we build our solution from the bottom up by solving all possible subproblems and storing their results to avoid redundant computation for overlapping subproblems.

Here's a breakdown of the implementation:

  • Memorized Recursion (dfs function): A depth-first search is performed recursively by the function dfs(i, j, k) which attempts to determine whether a substring of s1 starting at i and having a length k can be transformed into a substring of s2 starting at j of the same length (k). We use Python's @cache decorator to store the results of subproblems in memory, essentially creating a memoization table that the dfs function accesses before computing a new result. This significantly speeds up the algorithm by avoiding recomputation.

  • Subproblem Conditions: For each call to dfs(i, j, k), two cases are possible depending on whether the substrings are swapped or not:

    1. When there is no swap, we check if dfs(i, j, h) and dfs(i + h, j + h, k - h) return true for some split length h, meaning that the first h characters of the two substrings and the remaining k - h characters of the two substrings can make a scramble respectively.
    2. When there is a swap, we check if dfs(i, j + k - h, h) and dfs(i + h, j, k - h) return true for some split length h, meaning that the last h characters of the first substring and the first h characters of the second substring, and the remaining characters of both substrings can make a scramble, respectively.
  • Base Case: dfs includes a base case for when the length of the substring (k) is 1, at which point it simply compares the individual characters at the given indices i and j in s1 and s2, respectively.

  • Return Value: The overall result comes from the initial call to dfs(0, 0, len(s1)), which checks if the entire s1 can be scrambled to result in s2.

  • Complexity: The memoization search has a time complexity of O(n^4) and a space complexity of O(n^3), where n is the length of the strings. Time complexity is such because, in the worst case, we may need to check each possible split for each possible substring length k at every possible starting index i and j. Space complexity comes from the amount of information we need to store in our cache.

This recursive and memoization approach ensures we only compute what's necessary, making the problem solvable within a reasonable amount of time for larger strings.

Discover Your Strengths and Weaknesses: Take Our 2-Minute Quiz to Tailor Your Study Plan:

What is the best way of checking if an element exists in a sorted array once in terms of time complexity? Select the best that applies.

Example Walkthrough

Let's consider a small example to illustrate the solution approach. Suppose we have two strings s1 = "great" and s2 = "rgeat". We want to find out if s2 can be obtained from s1 by applying the scrambling operations described in the problem.

Step 1: Check if s1 can be scrambled to become s2 by calling dfs(0, 0, len(s1)). This is the top-level problem we need to solve.

Step 2: In our dfs(i, j, k) function, we start with i = 0, j = 0, and k = len("great") = 5. We need to check every possible split.

Step 3: Let's split s1 and s2 into h = 2 characters from the start and the remaining k - h = 3 characters. For s1 ("great"), the two parts are "gr" and "eat". For s2 ("rgeat"), the corresponding parts are "rg" and "eat".

In the no-swap case:

  • For the first 2 characters, a recursive call dfs(0, 0, 2) checks if "gr" can be scrambled into "rg", which is true. So "gr" can be transformed into "rg" by swapping the characters.
  • For the remaining 3 characters, a recursive call dfs(2, 2, 3) checks if "eat" can be scrambled into "eat", which is trivially true.

The recursive calls would return true since both parts can be made to match. Therefore, s2 is a valid scramble of s1 without even needing to check the swap case for this split.

Step 4: The algorithm would then proceed to check all other possible ways of splitting s1 and s2. However, since we have already found a valid scramble, the algorithm doesn't need to continue further for this example.

Step 5: After all the recursion and verification, if any of the recursive calls return true, we conclude that s2 is a scrambled version of s1. Since we found that dfs(0, 0, len(s1)) returns true, s2 = "rgeat" can indeed be made from s1 = "great".

Step 6: During the process, the results of various calls to dfs are stored in the memoization table. Should the same subproblems arise, the algorithm retrieves the result from the table instead of recalculating, effectively reducing computation time and complexity.

In summary, by breaking the problem into smaller subproblems, recursing on those, and memoizing the results, the algorithm can efficiently determine if one string is a scrambled version of the other.

Solution Implementation

1from functools import lru_cache
2
3class Solution:
4    def isScramble(self, s1: str, s2: str) -> bool:
5        # Decorator to cache the results of recursive calls
6        @lru_cache(maxsize=None)
7        def search_recursive(i: int, j: int, length: int) -> bool:
8            # Base case: if substring length is 1, check equality of chars
9            if length == 1:
10                return s1[i] == s2[j]
11
12            # Check for each possible split
13            for split_point in range(1, length):
14                # If a matching scramble is found with a split, return True
15                if (search_recursive(i, j, split_point) and
16                    search_recursive(i + split_point, j + split_point, length - split_point)):
17                    return True
18              
19                # Check for a scramble with a swapped second half
20                if (search_recursive(i + split_point, j, length - split_point) and
21                    search_recursive(i, j + length - split_point, split_point)):
22                    return True
23          
24            # If no scramble can be formed, return False
25            return False
26
27        # Initial call for the recursive search function
28        return search_recursive(0, 0, len(s1))
29
30# Example usage
31# sol = Solution()
32# result = sol.isScramble("great", "rgeat") # Should return True
33# print(result)
34
1class Solution {
2    private Boolean[][][] memoizationCache; // A 3D memoization cache to store the results of subproblems
3    private String str1; // First string to compare
4    private String str2; // Second string to compare
5
6    // Determines if s2 is a scrambled string of s1
7    public boolean isScramble(String s1, String s2) {
8        int length = s1.length();
9        this.str1 = s1;
10        this.str2 = s2;
11        memoizationCache = new Boolean[length][length][length + 1]; // Initialize memoization cache
12        return dfs(0, 0, length); // Launch the depth-first search starting with the full length
13    }
14
15    // Executes the depth-first search to see if two substrings are scrambled equivalents
16    private boolean dfs(int index1, int index2, int length) {
17        if (memoizationCache[index1][index2][length] != null) {
18            // If result is already computed for this subproblem, return the result
19            return memoizationCache[index1][index2][length];
20        }
21        if (length == 1) {
22            // If the length to compare is 1, check if characters are equal
23            return str1.charAt(index1) == str2.charAt(index2);
24        }
25        for (int partition = 1; partition < length; ++partition) {
26            // Check if swapping the partition leads to a scramble that matches
27            if (dfs(index1, index2, partition) && dfs(index1 + partition, index2 + partition, length - partition)) {
28                // The first segment of str1 matches the first segment of str2 and the second segment of str1 matches the second segment of str2
29                return memoizationCache[index1][index2][length] = true;
30            }
31            // Check if non-swapped variant leads to a scramble that matches
32            if (dfs(index1 + partition, index2, length - partition) && dfs(index1, index2 + length - partition, partition)) {
33                // The first segment of str1 matches the second segment of str2 and the second segment of str1 matches the first segment of str2
34                return memoizationCache[index1][index2][length] = true;
35            }
36        }
37        // If none of the above conditions lead to a scramble that matches, then return false
38        return memoizationCache[index1][index2][length] = false;
39    }
40}
41
1#include <cstring>
2#include <functional>
3#include <string>
4
5using namespace std;
6
7class Solution {
8public:
9    // Function to determine if two strings are scramble strings
10    bool isScramble(string s1, string s2) {
11        int length = s1.size();
12        // Define a 3D array to store the states of substring scrambles
13        int scrambleStates[length][length][length + 1];
14        memset(scrambleStates, -1, sizeof(scrambleStates));
15      
16        // Recursive lambda function to check for scramble strings
17        function<bool(int, int, int)> checkScramble = [&](int index1, int index2, int len) -> bool {
18            if (scrambleStates[index1][index2][len] != -1) {
19                // Use memoization to return the previously computed result
20                return scrambleStates[index1][index2][len] == 1;
21            }
22            if (len == 1) {
23                // If the length is 1, just compare the characters
24                return s1[index1] == s2[index2];
25            }
26            for (int split = 1; split < len; ++split) {
27                // Check if the first split part is a scramble and the second split part is a scramble
28                if (checkScramble(index1, index2, split) && checkScramble(index1 + split, index2 + split, len - split)) {
29                    scrambleStates[index1][index2][len] = 1;
30                    return true;
31                }
32                // Check if swapping the split parts results in a scramble
33                if (checkScramble(index1 + split, index2, len - split) && checkScramble(index1, index2 + len - split, split)) {
34                    scrambleStates[index1][index2][len] = 1;
35                    return true;
36                }
37            }
38            scrambleStates[index1][index2][len] = 0;
39            return false;
40        };
41      
42        // Start the recursive check with the full length of the strings
43        return checkScramble(0, 0, length);
44    }
45};
46
1// A function to determine if two strings are scrambles of each other
2function isScramble(str1: string, str2: string): boolean {
3    const length = str1.length;
4
5    // The memoization table where -1 indicates uninitialized, 0 for false, and 1 for true
6    const memo = new Array(length)
7        .fill(0)
8        .map(() => new Array(length).fill(0).map(() => new Array(length + 1).fill(-1)));
9
10    // A recursive function to check if substrings are scrambles
11    // i is the start index of the substring in str1
12    // j is the start index of the substring in str2
13    // k is the length of the substrings
14    const checkScramble = (i: number, j: number, k: number): boolean => {
15
16        // If this subproblem has already been computed, return the result
17        if (memo[i][j][k] !== -1) {
18            return memo[i][j][k] === 1;
19        }
20
21        // If the length of the substrings is 1, we simply compare the characters
22        if (k === 1) {
23            return str1[i] === str2[j];
24        }
25
26        // Check for all possible splits
27        for (let h = 1; h < k; ++h) {
28            // If swapping the current parts make them equals, set result to true
29            if (checkScramble(i, j, h) && checkScramble(i + h, j + h, k - h)) {
30                return Boolean((memo[i][j][k] = 1));
31            }
32            // If swapping the non-corresponding parts make them equals, set result to true
33            if (checkScramble(i + h, j, k - h) && checkScramble(i, j + k - h, h)) {
34                return Boolean((memo[i][j][k] = 1));
35            }
36        }
37
38        // If none of the splits result in matching strings, set result to false
39        return Boolean((memo[i][j][k] = 0));
40    };
41
42    // Calling the recursive check with the start indices at 0 and full length of the strings
43    return checkScramble(0, 0, length);
44}
45
Not Sure What to Study? Take the 2-min Quiz

In a binary min heap, the maximum element can be found in:

Time and Space Complexity

The given Python solution leverages recursion with memoization to determine if one string is a scramble of another. Here's an analysis of its time and space complexities:

Time Complexity:

The dfs function is memoized and explores each possible split of the string segments once for every distinct pair of starting indices (i, j) and length k. Since i and j can each range from 0 to n-1 (for an n-character string) and k can range from 1 to n, there are O(n^2) pairs of starting indices and O(n) possible lengths for each pair. Thus, the time complexity can seem to be O(n^3) considering each possible (i, j, k) triplet is calculated once due to memoization.

However, within every call to dfs(i, j, k), there is a loop that runs k-1 times. Since k can be up to n, the loop can contribute at most O(n) operations per memoized function call. This loop is where the fourth dimension of the time complexity stems from, leading to a total time complexity of O(n^4).

Space Complexity:

The space complexity is primarily dictated by the memoization cache. The space needed for the cache relates directly to the number of distinct arguments passed to the dfs function, which, as explained, is the product of the different values i, j, and k can take. Since i and j range from 0 to n-1 and k ranges from 1 to n, the max number of distinct states stored in the cache is n * n * n, which is O(n^3).

Besides the cache, the space complexity also includes the space for the call stack due to recursion. In the worst case, the recursive calls could stack up to O(n) deep (the maximum possible depth of recursion is when k decreases by 1 each time). However, this does not affect the overall space complexity, which remains O(n^3) due to the dominating size of the memoization cache.

Learn more about how to find time and space complexity quickly using problem constraints.

Fast Track Your Learning with Our Quick Skills Quiz:

Which type of traversal does breadth first search do?


Recommended Readings


Got a question? Ask the Teaching Assistant anything you don't understand.

Still not clear? Ask in the Forum,  Discord or Submit the part you don't understand to our editors.

←
↑TA đŸ‘šâ€đŸ«