854. K-Similar Strings


Problem Description

In this problem, we are given two strings s1 and s2 that are anagrams of each other. This means s1 and s2 consist of the same letters in a possibly different order. The concept of k-similarity between the strings is introduced where k represents the minimum number of swaps between any two positions in string s1 needed to transform it into string s2.

The task is to find the smallest non-negative integer k such that after exactly k swaps of pairs of letters in s1, we obtain s2. Note that during each swap, the two letters to be swapped in s1 need not be adjacent. The challenge lies in determining the most efficient sequence of swaps that results in s1 becoming s2, using the fewest number of swaps possible.

Intuition

To solve this problem, we use a Breadth-First Search (BFS) approach. We think of each string that can be formed by swapping letters in s1 as a node in a graph, with an edge between two nodes if one can be transformed into the other by a single swap. The problem then becomes a shortest path problem, where we want to find the fewest number of edges (swaps) required to reach s2 starting from s1.

With BFS, we traverse the graph level by level. Each level of the traversal represents an incremental increase in the number of swaps (increment in k). At each step, we generate all possible strings that are one swap away from the current string while ensuring two conditions:

  • We only swap letters that would bring us closer to the string s2 (the letter being swapped in s1 must match the letter in the same position in s2 and not be in its final position in s1).
  • We haven't already visited the newly created string in our traversal; this avoids unnecessary work and ensures we are always heading in the direction of increasing k optimally.

We utilize a queue to maintain the strings at the current level of BFS and a set vis to keep track of visited strings to prevent revisiting them. As soon as we find s2 during our traversal, we have found the smallest k, which represents the length of the shortest path in our graph of strings, and we return this value.

The rationale for why BFS works well here is because it guarantees that the first time we reach s2, it's through the shortest path possible, hence the minimum number of swaps k.

Learn more about Breadth-First Search patterns.

Not Sure What to Study? Take the 2-min Quiz to Find Your Missing Piece:

Which data structure is used to implement priority queue?

Solution Approach

The solution makes use of the Breadth-First Search (BFS) algorithm, which is a standard approach for finding the shortest path in an unweighted graph. Here, the nodes of the graph are different permutations of the string s1, and the edges represent a valid swap that can transform one permutation into another.

In the provided Python code, this approach is implemented as follows:

  • BFS is carried out using a queue named q — a deque data structure that allows for the efficient addition and removal of elements from both ends. Initially, the queue contains only the starting string s1.

  • A set named vis stores all the visited strings to avoid repeated processing and to keep track of the unique permutations that have already been considered.

  • The function next generates all possible strings that can be reached from the current string s with a single swap that gets us closer to the target string s2. To do this, next finds the first index i where the letters in s and s2 differ. Then it looks for indices j beyond i where swapping the letter at position i with the letter at position j is beneficial, namely where:

    • s[j] is equal to s2[i], meaning that this letter is desired at position i to match s2.
    • s[j] is not already in its final position (s[j] != s2[j]), ensuring that the swap will contribute to making the strings more similar.
  • With the BFS traversal, the algorithm goes through all strings at the current level of similarity before increasing the value of ans, which represents the number of swaps k. Each outer while loop represents another level of depth in the search, hence another swap.

  • The inner loop iterates through all strings in the current level of the q, performing swaps via the next function to get the next level of permutations and checking if any of these permutations match s2.

  • When the s2 string is finally found, ans is returned, which by the nature of BFS, will be the minimum number of swaps required.

  • The loop terminates as soon as s2 is encountered because BFS guarantees that the first occurrence of s2 will have been reached via the shortest path (minimum number of swaps).

Through this approach, even though the problem space could be large, the algorithm ensures an efficient traversal by only considering useful swaps and by avoiding the revisitation of previously encountered strings.

Discover Your Strengths and Weaknesses: Take Our 2-Minute Quiz to Tailor Your Study Plan:

Given an array of 1,000,000 integers that is almost sorted, except for 2 pairs of integers. Which algorithm is fastest for sorting the array?

Example Walkthrough

Let's consider a small example to illustrate the solution approach. Suppose we have s1 = "123" and s2 = "321". We need to find the minimum number of swaps in s1 to transform it into s2.

Beginning with the initial string s1 = "123", we enqueue this string into the queue q and mark "123" as visited in the set vis.

First Iteration

  • q = ["123"], vis = {"123"}, and ans = 0 (no swaps done yet).
  • We dequeue "123" and generate next possible strings.
  • We notice that s1[0] != s2[0], so we look for characters in s1 that match s2[0]. We find "3" at s1[2] matches s2[0].
  • However, swapping "1" and "3" in "123" yields "321", which is exactly s2. Since we've found s2, ans = 1 (one swap).
  • Our search ends here because we have transformed s1 into s2.

End of the process with ans = 1. The minimum number of swaps required is 1.

In this case, we found the answer quickly due to the simplicity of the example. In a more complex scenario, BFS would continue exploring further levels with more swaps until s2 is reached. This approach ensures the first time s2 is encountered during the BFS, it would be the result of the minimum number of swaps needed, hence we would return that as the smallest non-negative integer k.

Solution Implementation

1from collections import deque
2
3class Solution:
4    def kSimilarity(self, s1: str, s2: str) -> int:
5        # Helper function to generate the next state of strings that are one swap closer to s2
6        def generate_next_states(current_string):
7            i = 0
8            # Find the first position where characters differ
9            while current_string[i] == s2[i]:
10                i += 1
11
12            next_states = []
13            # Loop through the string to find swap positions
14            for j in range(i + 1, string_length):
15                # Swap only if it brings current_string closer to s2
16                if current_string[j] == s2[i] and current_string[j] != s2[j]:
17                    swapped_string = (current_string[:i] + current_string[j] + 
18                                      current_string[i+1:j] + current_string[i] + 
19                                      current_string[j+1:])
20                    next_states.append(swapped_string)
21            return next_states
22
23        # Initialize the queue with starting string s1
24        queue = deque([s1])
25        # Set to store visited strings
26        visited = set([s1])
27        # Number of swaps made
28        swaps = 0
29        string_length = len(s1)
30      
31        # Breadth-first search loop
32        while True:
33            queue_length = len(queue)
34            for _ in range(queue_length):
35                current_string = queue.popleft()
36                # If the current string equals s2, we've found the minimum swaps
37                if current_string == s2:
38                    return swaps
39                # Generate and iterate over next possible swap states
40                for next_state in generate_next_states(current_string):
41                    # Enqueue the next state if not visited
42                    if next_state not in visited:
43                        visited.add(next_state)
44                        queue.append(next_state)
45            # Increment the swap counter after each breadth level is processed
46            swaps += 1
47
1class Solution {
2  
3    // Method to find K similarity between strings s1 and s2. 
4    // It calculates the minimum number of adjacent swaps to make string s1 equal to s2
5    public int kSimilarity(String s1, String s2) {
6        Deque<String> queue = new ArrayDeque<>(); // Use a queue to store strings for BFS
7        Set<String> visited = new HashSet<>();    // Use a set to keep track of visited strings
8        queue.offer(s1);
9        visited.add(s1);
10        int stepCount = 0; // Initialize a step counter
11      
12        // Perform BFS until s1 can be transformed into s2
13        while (true) {
14            int queueSize = queue.size();
15            // Process all strings at the current step
16            for (int i = queueSize; i > 0; --i) {
17                String currentString = queue.pollFirst();
18                // If the current string matches s2, return the number of steps taken
19                if (currentString.equals(s2)) {
20                    return stepCount;
21                }
22                // Generate all possible next moves and add them to the queue if not visited
23                for (String nextString : generateNextStrings(currentString, s2)) {
24                    if (!visited.contains(nextString)) {
25                        visited.add(nextString);
26                        queue.offer(nextString);
27                    }
28                }
29            }
30            stepCount++; // Increment step count after each breadth of BFS
31        }
32    }
33
34    // Helper method to generate the next moves by swapping characters in s to move towards s2
35    private List<String> generateNextStrings(String s, String s2) {
36        int i = 0;
37        int length = s.length();
38        char[] chars = s.toCharArray();
39        // Find the first non-matching character position
40        while (chars[i] == s2.charAt(i)) {
41            i++;
42        }
43
44        List<String> nextMoves = new ArrayList<>();
45        // Generate all valid swaps to move characters closer to their final positions
46        for (int j = i + 1; j < length; ++j) {
47            // Swap if the character in 's' can be swapped to match the corresponding one in 's2'
48            // and it is not in its final position already
49            if (chars[j] == s2.charAt(i) && chars[j] != s2.charAt(j)) {
50                swap(chars, i, j); // Perform the swap
51                nextMoves.add(new String(chars)); // Add the new string after the swap
52                swap(chars, i, j); // Swap back for the next iteration
53            }
54        }
55        return nextMoves;
56    }
57
58    // Helper method to swap two characters in an array
59    private void swap(char[] chars, int i, int j) {
60        char temp = chars[i];
61        chars[i] = chars[j];
62        chars[j] = temp;
63    }
64}
65
1#include <string>
2#include <vector>
3#include <queue>
4#include <unordered_set>
5
6using namespace std;
7
8class Solution {
9public:
10    // Function that returns the minimum number of adjacent swaps required to transform
11    // string s1 into string s2 where the strings are k-similar.
12    int kSimilarity(string s1, string s2) {
13        queue<string> workQueue;     // Queue to hold strings to be processed
14        workQueue.push(s1);          // Starting with s1
15
16        unordered_set<string> visited; // Set to keep track of visited strings to avoid repetition
17        visited.insert(s1);
18
19        int steps = 0;  // Counter to track the number of swaps made
20
21        // Main processing loop
22        while (true) {
23            int levelSize = workQueue.size(); // Process every string in the current level
24            for (int i = 0; i < levelSize; ++i) {
25                auto currentString = workQueue.front();
26                workQueue.pop();
27
28                // If the current string is equal to s2, we have found the solution
29                if (currentString == s2) {
30                    return steps;
31                }
32
33                // Generate all possible strings that are one swap closer to s2
34                vector<string> neighbors = findNeighbors(currentString, s2);
35                for (auto& neighbor : neighbors) {
36                    // If we have not visited this string before, add it to the queue and mark it as visited
37                    if (visited.count(neighbor) == 0) {
38                        visited.insert(neighbor);
39                        workQueue.push(neighbor);
40                    }
41                }
42            }
43            steps++;  // Increment the number of swaps (levels in BFS) after processing the current level
44        }
45    }
46
47    // Helper function to generate all possible strings that are one swap closer to s2
48    vector<string> findNeighbors(string& currentString, string& target) {
49        int i = 0;
50        int n = currentString.size();
51      
52        // Find the first position where the characters differ between the two strings
53        for (; i < n && currentString[i] == target[i]; ++i) {}
54
55        vector<string> result;  // Vector to hold all the valid next strings
56
57        // Try swapping the character at index 'i' with each character 'j' that is different from target[j]
58        // and is the same as target[i], moving us one step closer to s2 with each swap
59        for (int j = i + 1; j < n; ++j) {
60            if (currentString[j] == target[i] && currentString[j] != target[j]) {
61                swap(currentString[i], currentString[j]);  // Execute the swap
62                result.push_back(currentString); // Add the new string to the result list
63                swap(currentString[i], currentString[j]);  // Swap back for next iteration
64            }
65        }
66        return result;
67    }
68};
69
1type StringQueue = string[]; // Alias for an array used as a queue
2type StringSet = Set<string>; // Alias for a set of strings
3
4let workQueue: StringQueue = []; // Queue to hold strings to be processed
5let visited: StringSet = new Set<string>(); // Set to keep track of visited strings to avoid repetition
6
7let steps: number = 0; // Counter to track the number of swaps made
8
9// Function that adds an element to the queue
10function enqueue(queue: StringQueue, element: string) {
11  queue.push(element);
12}
13
14// Function that removes and returns the first element from the queue
15function dequeue(queue: StringQueue): string | undefined {
16  return queue.shift();
17}
18
19// Function that returns the length of the queue
20function queueSize(queue: StringQueue): number {
21  return queue.length;
22}
23
24// Function that returns the minimum number of adjacent swaps required to transform
25// string s1 into string s2 where the strings are k-similar
26function kSimilarity(s1: string, s2: string): number {
27  enqueue(workQueue, s1); // Starting with s1
28  visited.add(s1);
29
30  while (true) {
31    let levelSize = queueSize(workQueue); // Process every string in the current level
32    for (let i = 0; i < levelSize; i++) {
33      let currentString = dequeue(workQueue);
34
35      if (currentString === s2) {
36        // If the current string is equal to s2, we have found the solution
37        return steps;
38      }
39
40      let neighbors = findNeighbors(currentString, s2);
41      for (let neighbor of neighbors) {
42        if (!visited.has(neighbor)) {
43          // If we have not visited this string before, add it to the queue and mark it as visited
44          visited.add(neighbor);
45          enqueue(workQueue, neighbor);
46        }
47      }
48    }
49    steps++; // Increment the number of swaps (levels in BFS) after processing the current level
50  }
51}
52
53// Helper function to generate all possible strings that are one swap closer to s2
54function findNeighbors(currentString: string, target: string): string[] {
55  let pos = 0;
56  const n = currentString.length;
57
58  // Find the first position where the characters differ between the two strings
59  while (pos < n && currentString.charAt(pos) === target.charAt(pos)) {
60    pos++;
61  }
62
63  let results: string[] = []; // Array to hold all the valid next strings
64
65  // Try swapping the character at index 'pos' with each character 'j'
66  // that is different from target[j] and is the same as target[pos],
67  // moving us one step closer to s2 with each swap
68  for (let j = pos + 1; j < n; j++) {
69    if (
70      currentString.charAt(j) === target.charAt(pos) &&
71      currentString.charAt(j) !== target.charAt(j)
72    ) {
73      let swappedString =
74        currentString.substring(0, pos) +
75        currentString.charAt(j) +
76        currentString.substring(pos + 1, j) +
77        currentString.charAt(pos) +
78        currentString.substring(j + 1);
79      results.push(swappedString); // Add the new string to the results array
80    }
81  }
82  return results;
83}
84
Not Sure What to Study? Take the 2-min Quiz:

Which of the following array represent a max heap?

Time and Space Complexity

The given Python code is solving the k-Similarity problem, which aims to determine the minimum number of adjacent swaps needed to transform one string into another, given that the strings are of the same length and consist of the same characters.

Time Complexity:

The time complexity of this algorithm primarily depends on the number of levels we traverse in the BFS and the number of branches we generate at each level.

  • Breadth-First Search (BFS) Traversal: The code uses a BFS traversal to explore all possibilities via swapping operations. The BFS traversal could, in the worst case, explore all permutations of the string s1, which would be O(n!) where n is the length of the string s1.
  • Branching Factor in BFS: At each level of BFS, the next() function can potentially generate multiple branches. This branching factor can be at most n, where n is the length of the strings since we could theoretically swap the first misaligned character with any of the other n - 1 remaining characters that match the required character in s2.
  • Combining the above two, a loose upper bound for the time complexity of the BFS traversal with branching is O(n * n!), but this is a very pessimistic upper bound as it assumes we explore all permutations of s1, and in each step, we have the maximum amount of valid swaps.

However, the BFS guarantees that we find the shortest path, so we will likely find a solution before exploring all permutations. We can say that the average time complexity is better than this upper bound but it's hard to quantify without more information about the properties of the specific strings s1 and s2.

Space Complexity:

The space complexity is influenced by two factors:

  • Queue Storage (q): The maximum size of the queue could theoretically approach the number of all possible permutations in the worst case, which is O(n!).
  • Visited Set (vis): The visited set can hold a unique entry for each visited permutation, which, in the worst case, can also be O(n!).

Therefore, the worst-case space complexity of the algorithm is O(n!). This will likely be significantly less in practice due to the pruning of the BFS search space, meaning we do not typically visit all permutations.

Learn more about how to find time and space complexity quickly using problem constraints.

Fast Track Your Learning with Our Quick Skills Quiz:

Which of the following is a good use case for backtracking?


Recommended Readings


Got a question? Ask the Teaching Assistant anything you don't understand.

Still not clear? Ask in the Forum,  Discord or Submit the part you don't understand to our editors.


TA 👨‍🏫