2977. Minimum Cost to Convert String II


Problem Description

In this problem, we are given two strings, source and target, both of 0-indexed length n comprising lowercase English characters. We also have two arrays of strings, original and changed, and an array of integers cost. Each element within the cost array stands for the expense of altering a string in original to its corresponding string in changed.

The challenge lies in transforming source into target at the minimum possible overall cost, adhering to certain transformation rules. We may replace any substring from source with a new substring (if there's a matching transformation in original, changed, and corresponding cost). The rules for performing transformations are:

  1. The substrings selected for two distinct operations should not overlap unless they are precisely the same.
  2. The substrings must either be disjoint or identical.

Our goal is to ascertain the lowest price to turn source into target. It may be that no sequence of operations can achieve the transformation, in which case we must return -1.

Intuition

The core intuition for solving this problem is to treat the problem as a graph problem where strings represent nodes and conversion costs represent edges. We begin by creating a Trie structure to efficiently store all the source and destination strings (original and changed) as nodes with unique integer identifiers. This helps us to quickly look up any substring's corresponding transformation node.

We also initialize a graph, represented as a two-dimensional array g, keeping track of the minimum cost to convert each possible string original[i] to changed[i]. This graph will be used in applying the Floyd-Warshall algorithm to find out the shortest path, or in our case, the minimum cost to transform any string to another given all the intermediate conversion possibilities provided by original and changed.

Next comes the Floyd-Warshall algorithm—the classic algorithm for finding shortest paths in a weighted graph with positive or negative edge weights. Here the algorithm helps precompute the minimum cost to convert any original[i] to changed[i], considering intermediate conversions that might lower the total cost.

Once we have precomputed the conversion costs between any two substrings, we use Depth-First Search (DFS) along with memoization to recursively find the minimum cost needed to transform the source into the target, starting from the first character. Memoization helps to store the results of subproblems and avoid redundancy, speeding up the computation.

The DFS function works by comparing the current characters of source and target. If they are equal, we can continue to the next character without any cost. If they differ, DFS explores multiple scenarios where the current substring of source is converted to all possible substrings of target, each with their associated cost, and then adds the result of the recursive call for the rest of the string. The minimum cost out of these scenarios is selected.

Finally, it all boils down to calling our DFS starting from the beginning of the source string and using the precomputed graph to efficiently manage transformations, with the minimum cost being retrieved from the memoized search.

Learn more about Graph, Trie, Dynamic Programming and Shortest Path patterns.

Not Sure What to Study? Take the 2-min Quiz to Find Your Missing Piece

Which data structure is used to implement priority queue?

Solution Approach

The solution to this problem leverages a combination of data structures and algorithms to achieve its goal. Here's a step-by-step breakdown of the approach:

  1. Trie Construction: We utilize a Trie to store all the original and changed strings. A Trie is an efficient data structure for storing a dynamic set of strings, which allows for quick lookup operations. In this case, the Trie helps us map every original string to a unique integer identifier and, similarly, every changed string to another identifier.

  2. Graph Initialization: We then initialize a graph g as a two-dimensional array with dimensions m x m, where m is the count of all unique strings in the Trie. The graph contains information about the conversion costs, initialized to infinity (inf), except for self-transitions, which have a cost of 0.

  3. Floyd-Warshall Algorithm: This classic algorithm is used to compute the shortest paths between all pairs of vertices in the graph. Applied here, it calculates the minimum cost to convert any string i to string j, factoring in the possibility of intermediate string transformations that could reduce the total cost. The update rule for Floyd-Warshall is often formulated as:

    1`g[i][j] = min(g[i][j], g[i][k] + g[k][j])`

    This implies that the direct cost to convert from i to j is compared with the cost to convert from i to k and then from k to j, for each intermediate node k. We update the cost if the latter path is cheaper.

  4. Depth-First Search with Memoization: We define function dfs(i) to find the minimum cost of converting source[i..] to target[i..], starting from index i. If at any step source[i] equals target[i], no cost incurs and the DFS continues to the next index. When they differ, we explore all possible conversions of substrings starting from i, considering the precomputed conversion costs. The function dfs utilizes memoization to store results of subproblems to avoid repeating calculations.

  5. Enumerating Substring Conversions: During the DFS, we consider every substring of source starting at index i: source[i..j] for all j >= i. For each substring, we find corresponding node identifiers in the Trie—single quotes denote code. The conversion cost from the current source substring to the analogous target substring is g[p.v][q.v], where p and q represent the nodes in the Trie for the source and target substrings. We then combine this cost with the at dfs(j + 1), the DFS for the remaining substring of source starting at j + 1.

  6. Final Answer: The minimal cost to transform the entire source to target is given by calling dfs(0), the DFS starting from the first index of source. If the resulting cost is infinite (inf), it means there is no way to fully convert source to target, and thus we return -1. Otherwise, we return the computed minimum cost.

Put simply, the solution consolidates the precalculated costs of substring conversions and deploys DFS to recursively find the minimum cost from source to target, significantly aided by the memoization technique to optimize the recursion.

Discover Your Strengths and Weaknesses: Take Our 2-Minute Quiz to Tailor Your Study Plan:

Consider the classic dynamic programming of longest increasing subsequence:

Find the length of the longest subsequence of a given sequence such that all elements of the subsequence are sorted in increasing order.

For example, the length of LIS for [50, 3, 10, 7, 40, 80] is 4 and LIS is [3, 7, 40, 80].

What is the recurrence relation?

Example Walkthrough

Let's walk through a small example to illustrate the solution approach. Suppose we have the following inputs:

  • source = "abcd"
  • target = "bdce"
  • original = ["a", "bc", "d"]
  • changed = ["b", "f", "e"]
  • cost = [1, 2, 1]

Constructing the Trie (Step 1)

We begin by inserting all original and changed strings into a Trie to efficiently organize the transformations. We'll have nodes with identifiers (for simplicity, a, bc, and d will have identifiers 1, 2, 3, and b, f, and e will have identifiers 4, 5, 6, respectively).

Initializing the Graph (Step 2)

Next, we create a graph g with dimensions 6 x 6 as there are six unique strings (nodes) in the Trie.

We initialize the graph with:

  • g[i][i] = 0 for all i representative of self-conversions which have no cost.
  • g[i][j] = inf for other elements, as we have not yet established conversion costs.

Updating the graph with the given costs gives us:

  • g[1][4] = 1 (cost of changing "a" to "b")
  • g[2][5] = 2 (cost of changing "bc" to "f")
  • g[3][6] = 1 (cost of changing "d" to "e")

Applying Floyd-Warshall (Step 3)

We execute the Floyd-Warshall algorithm to precompute the minimum costs for converting any original string to changed accounting for intermediate transformations.

Since our graph already reflects the direct costs and there are no cheaper intermediate nodes to connect them (the graph has no additional edges), the graph remains unchanged after Floyd-Warshall in this small example.

Depth-First Search (DFS) with Memoization (Step 4 & 5)

We call the dfs function starting at index 0. Let's examine possible conversions at each step:

  • dfs(0): "abcd" -> "bdce"

    • Converting "a" (source[0]) to "b" has a cost of 1 (g[1][4]).
    • After this operation, "bcd" must be converted to "dce".
    • Call dfs(1), which will process "bcd".
  • dfs(1): "bcd" -> "dce"

    • Converting "bc" to "f" does not help us in this case as "f" is not a substring of "dce".
    • Converting "b" to "d" is not possible with our given transformations.
    • Therefore, we look at converting "c" ("b" is skipped as there is no change that matches).
    • There is no direct conversion available from "c" to "d", we skip to the next character.
  • dfs(2): "d" -> "ce"

    • Converting "d" to "e" has a cost of 1 (g[3][6]).
    • After this operation, nothing is left from "source", but we have an extra "c" in the target that cannot be achieved, so this path is not possible.

Since none of the transformations satisfy the complete conversion from "abcd" to "bdce", we reach a dead-end.

Results (Step 6)

After exploring all possible conversions via DFS, we see that:

  • The cost of transforming "a" to "b" is 1.
  • There is no cost-effective way to transform "bcd" to "dce", leading to an infinite cost.

Hence, the result is -1, indicating that it is not possible to transform source to target using the given transformations and costs.

Solution Implementation

1from typing import List
2from functools import cache
3from math import inf
4
5
6class Node:
7    __slots__ = ["children", "value"]
8
9    def __init__(self):
10        # Each Node has up to 26 children for each letter of the alphabet and a value.
11        self.children: List[Node | None] = [None] * 26
12        self.value = -1
13
14
15class Solution:
16    def minimumCost(self, source: str, target: str, original: List[str], changed: List[str], cost: List[int]) -> int:
17        m = len(cost)
18        # Initialize a graph with infinite edge costs.
19        graph = [[inf] * (m << 1) for _ in range(m << 1)]
20      
21        for i in range(m << 1):
22            graph[i][i] = 0
23          
24        root = Node()
25        index = 0
26
27        def insert(word: str) -> int:
28            # Insert a word into the trie and return its index.
29            node = root
30            for char in word:
31                i = ord(char) - ord('a')
32                if node.children[i] is None:
33                    node.children[i] = Node()
34                node = node.children[i]
35            if node.value < 0:
36                nonlocal index
37                node.value = index
38                index += 1
39            return node.value
40
41        @cache
42        def dfs(i: int) -> int:
43            # Depth-first search to find the minimum cost to transform source[i:] to target[i:].
44            if i >= len(source):
45                return 0
46            cost_if_same = dfs(i + 1) if source[i] == target[i] else inf
47          
48            p = q = root
49            minimum_cost = cost_if_same
50          
51            for j in range(i, len(source)):
52                if p is None or q is None:
53                    break
54                p = p.children[ord(source[j]) - ord('a')]
55                q = q.children[ord(target[j]) - ord('a')]
56              
57                if p is None or q is None or p.value < 0 or q.value < 0:
58                    continue
59              
60                minimum_cost = min(minimum_cost, dfs(j + 1) + graph[p.value][q.value])
61            return minimum_cost
62
63        # Build the initial graph using original and changed words, and their associated cost.
64        for x, y, z in zip(original, changed, cost):
65            x = insert(x)
66            y = insert(y)
67            graph[x][y] = min(graph[x][y], z)
68          
69        # Use Floyd–Warshall algorithm to find the shortest paths in graph.
70        for k in range(index):
71            for i in range(index):
72                if graph[i][k] >= inf:  # Skip if the edge does not exist.
73                    continue
74                for j in range(index):
75                    if graph[i][k] + graph[k][j] < graph[i][j]:
76                        graph[i][j] = graph[i][k] + graph[k][j]
77
78        # Calculate the minimum cost to transform the entire source to the target.
79        answer = dfs(0)
80      
81        # Return -1 if it's not possible or the answer if it's valid.
82        return -1 if answer >= inf else answer
83
1import java.util.Arrays;
2
3class Node {
4    Node[] children = new Node[26];
5    int valueIndex = -1; // Initialize with -1 to indicate not assigned.
6}
7
8class Solution {
9    private final long INF = 1L << 60; // A large value to represent infinity in the context.
10    private Node root = new Node();    // Trie root node.
11    private int index;                 // Tracks indices for the nodes in the trie.
12
13    private long[][] graph; // Represents the cost graph between trie node indices.
14    private char[] sourceArray; // Source string to character array.
15    private char[] targetArray; // Target string to character array.
16    private Long[] dpMemo;      // Dynamic programming memoization table.
17
18    // Computes the minimum cost to convert the `source` string to the `target` string.
19    public long minimumCost(
20        String source, String target, String[] originals, String[] changeds, int[] costs) {
21        int transformationCount = costs.length;
22        graph = new long[transformationCount << 1][transformationCount << 1];
23        sourceArray = source.toCharArray();
24        targetArray = target.toCharArray();
25
26        // Initialize graph with INF and 0 on the diagonal (no cost to convert to itself).
27        for (int i = 0; i < graph.length; ++i) {
28            Arrays.fill(graph[i], INF);
29            graph[i][i] = 0;
30        }
31
32        // Build the trie and graph with minimum costs between conversions.
33        for (int i = 0; i < transformationCount; ++i) {
34            int startIndex = insertIntoTrie(originals[i]);
35            int endIndex = insertIntoTrie(changeds[i]);
36            graph[startIndex][endIndex] = Math.min(graph[startIndex][endIndex], costs[i]);
37        }
38
39        // Floyd-Warshall algorithm to find all-pairs shortest paths in the graph.
40        for (int k = 0; k < index; ++k) {
41            for (int i = 0; i < index; ++i) {
42                if (graph[i][k] >= INF) {
43                    continue; // Skip if there is no path from i to k.
44                }
45                for (int j = 0; j < index; ++j) {
46                    graph[i][j] = Math.min(graph[i][j], graph[i][k] + graph[k][j]);
47                }
48            }
49        }
50
51        // Initialize dynamic programming memoization table.
52        dpMemo = new Long[sourceArray.length];
53        // Start the recursive depth-first search from index 0.
54        long answer = dfs(0);
55        // If the result is INF, no valid conversion exists; return -1.
56        return answer >= INF ? -1 : answer;
57    }
58
59    // Inserts a string into the trie and returns its index (valueIndex).
60    private int insertIntoTrie(String word) {
61        Node node = root;
62        for (char c : word.toCharArray()) {
63            int index = c - 'a';
64            if (node.children[index] == null) {
65                node.children[index] = new Node();
66            }
67            node = node.children[index];
68        }
69        if (node.valueIndex < 0) {
70            node.valueIndex = index++; // Assign index if not already assigned.
71        }
72        return node.valueIndex;
73    }
74
75    // Recursive depth-first search function to compute minimum cost.
76    private long dfs(int currentIndex) {
77        // Base case: If we've reached the end of the source array, no cost needed.
78        if (currentIndex >= sourceArray.length) {
79            return 0;
80        }
81        // Retrieve precomputed result from memo table if it exists.
82        if (dpMemo[currentIndex] != null) {
83            return dpMemo[currentIndex];
84        }
85        // Compare characters at the current index of source and target strings.
86        long result = sourceArray[currentIndex] == targetArray[currentIndex] ? dfs(currentIndex + 1) : INF;
87
88        // Search for the cheapest conversion from currentIndex to the end of strings.
89        Node sourceNode = root, targetNode = root;
90        for (int j = currentIndex; j < sourceArray.length; ++j) {
91            sourceNode = sourceNode.children[sourceArray[j] - 'a'];
92            targetNode = targetNode.children[targetArray[j] - 'a'];
93            if (sourceNode == null || targetNode == null) {
94                break; // No possible conversion if either nodes are null.
95            }
96            if (sourceNode.valueIndex < 0 || targetNode.valueIndex < 0) {
97                continue; // If no index assigned to nodes, skip.
98            }
99            long transformationCost = graph[sourceNode.valueIndex][targetNode.valueIndex];
100            if (transformationCost < INF) {
101                result = Math.min(result, transformationCost + dfs(j + 1));
102            }
103        }
104        // Store the result in memoization table and return.
105        return dpMemo[currentIndex] = result;
106    }
107}
108
1class Node {
2public:
3    Node* children[26]; // Pointers to child Nodes, one for each letter in the alphabet
4    int value = -1;     // Unique identifier for the node, -1 means uninitialized
5    Node() {
6        fill(begin(children), end(children), nullptr);
7    }
8};
9
10class Solution {
11private:
12    const long long kInf = 1LL << 60; // Represents an unreachable state or infinity
13    Node* root = new Node();           // Root of the trie
14    int index;                         // Index to assign unique IDs to trie nodes
15
16    vector<vector<long long>> graph;   // Representation of the transformation costs as a graph
17    string sourceString;               // Source string to transform from
18    string targetString;               // Target string to transform to
19    vector<long long> memoizationCache; // Memoization cache to store results of subproblems
20
21public:
22    long long minimumCost(string source, string target, vector<string>& original, vector<string>& changed, vector<int>& cost) {
23        int modificationCount = cost.size();
24        graph = vector<vector<long long>>(modificationCount << 1, vector<long long>(modificationCount << 1, kInf));
25        sourceString = source;
26        targetString = target;
27
28        // Set self-transformation cost to 0
29        for (int i = 0; i < graph.size(); ++i) {
30            graph[i][i] = 0;
31        }
32
33        // Build the graph with transformation costs
34        for (int i = 0; i < modificationCount; ++i) {
35            int fromIndex = insert(original[i]);
36            int toIndex = insert(changed[i]);
37            graph[fromIndex][toIndex] = min(graph[fromIndex][toIndex], static_cast<long long>(cost[i]));
38        }
39
40        // Floyd-Warshall algorithm to find all pairs shortest paths
41        for (int k = 0; k < index; ++k) {
42            for (int i = 0; i < index; ++i) {
43                if (graph[i][k] >= kInf) continue; // Skip over infinite costs
44                for (int j = 0; j < index; ++j) {
45                    graph[i][j] = min(graph[i][j], graph[i][k] + graph[k][j]);
46                }
47            }
48        }
49
50        // Initialize memoization cache for DFS
51        memoizationCache = vector<long long>(sourceString.length(), -1);
52        long long answer = depthFirstSearch(0);
53        // Return -1 if it is not possible to transform source to target string within given costs
54        return answer >= kInf ? -1 : answer;
55    }
56
57private:
58    // Inserts a word into the trie and returns its index
59    int insert(const string& word) {
60        Node* node = root;
61        for (char c : word) {
62            int i = c - 'a';
63            if (node->children[i] == nullptr) {
64                node->children[i] = new Node();
65            }
66            node = node->children[i];
67        }
68        if (node->value < 0) {
69            node->value = index++;
70        }
71        return node->value;
72    }
73
74    // DFS to compute the minimum cost of transforming the source substring starting at index `i`
75    long long depthFirstSearch(int i) {
76        if (i >= sourceString.length()) {
77            return 0;
78        }
79        if (memoizationCache[i] != -1) {
80            return memoizationCache[i];
81        }
82        long long result = (sourceString[i] == targetString[i]) ? depthFirstSearch(i + 1) : kInf;
83        Node* pNode = root;
84        Node* qNode = root;
85        for (int j = i; j < sourceString.length(); ++j) {
86            pNode = pNode->children[sourceString[j] - 'a'];
87            qNode = qNode->children[targetString[j] - 'a'];
88            if (pNode == nullptr || qNode == nullptr) {
89                break;
90            }
91            if (pNode->value < 0 || qNode->value < 0) {
92                continue;
93            }
94            long long tempCost = graph[pNode->value][qNode->value];
95            if (tempCost < kInf) {
96                result = min(result, tempCost + depthFirstSearch(j + 1));
97            }
98        }
99        return memoizationCache[i] = result;
100    }
101};
102
1// Declaration for the Node type
2type Node = {
3    children: (Node | null)[];
4    value: number;
5};
6
7// Function to create a new Node with initialized values
8const createNode = (): Node => ({
9    children: Array(26).fill(null),
10    value: -1
11});
12
13// The global "root" node of the trie
14const root: Node = createNode();
15
16// Mapping from index to nodes
17let index: number = 0;
18
19// Memoization array for the DFS
20const memo: number[] = Array(100).fill(-1);
21
22// Function to insert a word into the trie and return the corresponding node value
23const insertWord = (word: string): number => {
24    let node: Node = root;
25    for (const char of word) {
26        const charIndex: number = char.charCodeAt(0) - 'a'.charCodeAt(0);
27        if (node.children[charIndex] === null) {
28            node.children[charIndex] = createNode();
29        }
30        node = node.children[charIndex]!;
31    }
32    if (node.value < 0) {
33        node.value = index++;
34    }
35    return node.value;
36};
37
38// The main minimumCost function definition
39function minimumCost(
40    source: string,
41    target: string,
42    original: string[],
43    changed: string[],
44    cost: number[],
45): number {
46    const originalLength = cost.length;
47    const sourceLength = source.length;
48    const graph: number[][] = Array.from({ length: originalLength << 1 }, () => Array(originalLength << 1).fill(Infinity));
49
50    // Fill the graph based on original and changed strings and their associated costs
51    for (let i = 0; i < originalLength; ++i) {
52        const fromNode: number = insertWord(original[i]);
53        const toNode: number = insertWord(changed[i]);
54        graph[fromNode][toNode] = Math.min(graph[fromNode][toNode], cost[i]);
55    }
56
57    // Apply the Floyd-Warshall algorithm to find all pairs shortest paths
58    for (let k = 0; k < index; ++k) {
59        for (let i = 0; i < index; ++i) {
60            for (let j = 0; j < index; ++j) {
61                graph[i][j] = Math.min(graph[i][j], graph[i][k] + graph[k][j]);
62            }
63        }
64    }
65
66    // The DFS function to be used in the minimumCost function
67    const depthFirstSearch = (i: number): number => {
68        if (i >= sourceLength) {
69            return 0;
70        }
71        if (memo[i] !== -1) {
72            return memo[i];
73        }
74        let result: number = source[i] === target[i] ? depthFirstSearch(i + 1) : Infinity;
75        let p: Node = root;
76        let q: Node = root;
77        for (let j = i; j < source.length; ++j) {
78            p = p.children[source[j].charCodeAt(0) - 'a'.charCodeAt(0)]!;
79            q = q.children[target[j].charCodeAt(0) - 'a'.charCodeAt(0)]!;
80            if (!p || !q) {
81                break;
82            }
83            if (p.value < 0 || q.value < 0) {
84                continue;
85            }
86            const t: number = graph[p.value][q.value];
87            result = Math.min(result, t + depthFirstSearch(j + 1));
88        }
89        return (memo[i] = result);
90    };
91
92    // Start the DFS from the first character
93    const answer: number = depthFirstSearch(0);
94
95    // Return the minimum cost or -1 if no valid transformation is possible
96    return answer >= Infinity ? -1 : answer;
97}
98
Not Sure What to Study? Take the 2-min Quiz

Which of the following is a min heap?

Time and Space Complexity

Time Complexity

The time complexity of this code involves multiple components:

  1. Initializing a 2D array g with dimensions m << 1 by m << 1 (which means 2*m by 2*m, since << 1 is a left bit shift equivalent to multiplying by 2), where m is the length of the arrays original, changed, and cost. This operation is O(m^2) since it requires initializing an m * m matrix.

  2. Inserting strings from original and changed into trie nodes. Insertion of a word into a trie has a complexity of O(k), where k is the length of the word. Since every original and changed string is inserted once, and there are m such strings, the complexity for this part becomes O(m * k), where k is the average length of the words in original and changed. However, in this analysis, k seems to be considered constant, thus this step is included in O(m).

  3. Floyd-Warshall algorithm is used to compute the shortest path, which is O(m^3) as it involves three nested loops each running for all the nodes in the graph.

  4. The dfs function is a recursive function with memoization that explores multiple paths in the source and target strings. This complexity is harder to analyze directly but a rough upper bound can be O(n^2) where n is the length of the source and target, noting that due to memoization and early termination it may often be much less.

  5. The double for-loop within the dfs function seems to contribute an additional O(m * n) since it iterates through two pointers in source and target between i and len(source) while taking m into account through the checks on p.v and q.v.

Overall, combining these operations, the time complexity is O(m^3 + n^2 + m * n).

Space Complexity

The space complexity is determined by:

  1. The space used by the memoization cache in the dfs function, which can go up to O(n).

  2. The 2D array g of size (m << 1) * (m << 1) contributes O(m^2).

  3. The space taken by the trie to store up to 2*m nodes (m from original and m from changed), each containing an array of 26 children pointers (due to the English lowercase alphabet), is O(26*m) which simplifies to O(m).

  4. Additional space of O(n) for the call stack due to recursion in the dfs function.

Combining these contributions, the space complexity is O(m^2 + m * n + n).

Learn more about how to find time and space complexity quickly using problem constraints.

Fast Track Your Learning with Our Quick Skills Quiz:

Which type of traversal does breadth first search do?


Recommended Readings


Got a question? Ask the Teaching Assistant anything you don't understand.

Still not clear? Ask in the Forum,  Discord or Submit the part you don't understand to our editors.

←
↑TA đŸ‘šâ€đŸ«