2060. Check if an Original String Exists Given Two Encoded Strings
Problem Description
The LeetCode problem involves finding whether two given encoded strings could have originated from the same raw unencoded string. An original string is composed of lowercase English letters and can be encoded by following a procedure that involves splitting the string into a sequence of non-empty substrings, replacing chosen elements with their length, and then concatenating the sequence to form the encoded string.
For instance, the string "abcdefghijklmnop"
could be split into ["ab", "cdefghijklmn", "o", "p"]
, and by replacing some elements with their lengths, we might get ["ab", "12", "1", "p"]
, which when concatenated, gives the encoded string "ab121p"
.
The challenge with this problem is to discern whether we can backtrack from two given encoded strings, which include digits (representing the lengths of substrings in the original string) alongside letters, and determine if it is possible that they were encoded from the same original string.
This task requires a method to compare segments of the encoded strings and the numerical values, taking into account different possible segmentations of the original string and the ambiguity of numerical representation — a digit sequence like '123' can represent a single segment of length 123 or a sequence of segments with lengths 1, 2, and 3, for example.
Intuition
For this problem, dynamic programming is a natural approach to tackle the complexity of matching parts of two encoded strings. The intuition is to process each string character by character, and at each step, consider all possible encodings that could have led to the current segments of both strings.
A 2D dynamic programming array dp
is utilized where dp[i][j]
is a set of integers representing the possible differences in lengths between the first i
characters of s1
and the first j
characters of s2
. A positive number means s2
is longer, while a negative means s1
is longer. If zero is in the set, it means the substrings could match exactly up to that point.
We iterate over each character of s1
and s2
, and at each iteration, we try to extend the matching in three different ways:
-
If we encounter a digit in
s1
ors2
, we form numbers by concatenating up to three consecutive digits (since the problem ensures there won't be more than three consecutive digits) and adjust the length difference accordingly. -
If the current character in
s1
ors2
is a digit and the other string has a letter at the corresponding position, we assume that the digit is encoding a segment of letters and update the length difference by adding or subtracting 1, respectively. -
If we have a letter on both strings and the current length difference is zero, meaning we are aligned, and the letters match, we continue to the next position keeping the length difference zero.
The recursive relationships and the usage of a set to maintain possible length differences at each step allow us to keep track of all viable ways in which the strings could be aligned or misaligned.
In the end, we check if there's a zero in dp[n][m]
set because this implies that there exists a way to segment and match the strings such that they could originate from the same encoded string.
Learn more about Dynamic Programming patterns.
Solution Approach
The solution follows a dynamic programming approach to handle the complexity of varying segment lengths and the ambiguity inherent in the encoded strings. Here's how the provided TypeScript solution implements this strategy.
Data Structures Used:
- A two-dimensional dynamic programming array
dp
, where each elementdp[i][j]
is a set of integers. This set represents all the possible differences in lengths (deltas) between the firsti
characters ofs1
and the firstj
characters ofs2
.
Algorithms/Patterns Used:
-
Initializing the DP Table: The dp array is initialized so that
dp[0][0]
contains only the value 0, meaning that there's no difference in length when no characters froms1
ors2
have been considered. -
Nested Loops for DP Table Filling: There are two nested loops—one for index
i
(which goes throughs1
) and one for indexj
(which goes throughs2
). These loops iterate through each character position in both strings. -
Handling Digits and Characters:
- If a digit is found in
s1
, and the current delta is non-positive (meanings1
is not longer thans2
), we keep track of any number formed by up to three consecutive digits and add this number to the delta, storing the new delta in the appropriatedp
cell. - The same process applies for
s2
, except we subtract the number from the delta if a digit is encountered and the current delta is non-negative.
- If a digit is found in
-
Updating Deltas When Matching Digits with Characters:
- If a digit in
s1
corresponds to a character ins2
(when the delta is negative), we increment the delta by 1, and store it in thedp
index that corresponds to moving one character forward ins1
. - Similarly, if a digit in
s2
corresponds to a character ins1
(when the delta is positive), we decrement the delta by 1, and store it in thedp
index that corresponds to moving one character forward ins2
.
- If a digit in
-
Matching Characters:
- When both
s1[i]
ands2[j]
are characters and the deltas are zero, we check if the characters match. If they do, we carry over the zero delta to the next indices, meaning the match can continue.
- When both
-
Checking for a Match: After filling in the DP table, we check
dp[n][m]
for a zero value. If zero is present, it indicates that there is at least one valid way to segment the original string to match both encoded strings.
Helper Functions:
isDigit
: A small helper function that checks if a given character is a digit using a regular expression.
The provided solution utilizes these patterns systematically to account for every possible segmentation and encoding that could result from the original string, ensuring that all cases are covered. The dynamic programming set construction avoids recalculating possibilities for each substring pair, thus optimizing the solution.
Ready to land your dream job?
Unlock your dream job with a 2-minute evaluator for a personalized learning plan!
Start EvaluatorExample Walkthrough
Let's consider two encoded strings s1 = "a3b"
and s2 = "4ab"
and walk through the solution approach to understand if they could have been encoded from the same original string.
-
Initialize the DP table: We set
dp[0][0]
to contain{0}
as initially, there are no characters considered, so the difference in lengths is zero. -
First Iteration:
i = 0
,j = 0
:s1[i]
is'a'
ands2[j]
is'4'
. Sinces1[i]
is a character ands2[j]
is a digit, and the current set atdp[i][j]
is{0}
, we can't directly compare them. We interpret'4'
as the length of a segment, which meanss2
is implied to be longer by 4 characters. So we incrementj
to process the next character ins2
, updatingdp[i][j+1]
to{4}
.
-
Second Iteration:
i = 0
,j = 1
: Now,s2[j]
is'a'
, and the delta fromdp[i][j]
is{4}
. We can match the characters only if the delta were zero. However, we have another digit'3'
ins1
. Since we interpret this digit as the length of a segment and we have a delta of 4, we can match three characters ofs1
with three characters of the length ins2
, reducing the delta by 3. We updatedp[i+1][j]
(dp[1][1]
) to{1}
which now accounts for the remaining single character difference that has not been matched.
-
Third Iteration:
i = 1
,j = 1
: Withs1[i] = '3'
ands2[j] = 'a'
, the delta is{1}
atdp[1][1]
. Since the delta is positive, we can match onea
character froms2
with a supposed single character froms1
encoded as '3'. The delta becomes 0 because one character is matched from boths1
ands2
. We updatedp[i][j+1]
(dp[1][2]
) to{0}
.
-
Fourth Iteration:
i = 1
,j = 2
: Now, boths1[i]
ands2[j]
represent characters,'3'
and'b'
respectively, with a delta of{0}
. Since'3'
is not a character, this iteration doesn't directly involve a character match. Because'3'
ins1
may represent several characters encoded, we assume it represents at least one letter and decrement it by 1, assuming we're matching it with'b'
ins2
, which incrementsi
by 1. We add{1}
todp[2][3]
to represent this possible scenario.
-
Fifth Iteration:
i = 2
,j = 3
: In this step,s1[i]
isb
whiles2
is at the end. We have a set{1}
atdp[2][3]
, which implies an unmatched character ins1
, so we cannot proceed further.
Upon completion of iterating through the characters:
- We check
dp[n][m]
(dp[3][4]
in this case) for the presence of zero. A zero would indicate there is a matching segmentation of original string fors1
ands2
. Since there's no such value in the last cell, the given stringss1
ands2
cannot be matched, and hence, it is not possible that they were encoded from the same original string.
Solution Implementation
1def possibly_equals(s1: str, s2: str) -> bool:
2 length_s1 = len(s1)
3 length_s2 = len(s2)
4 # Initialize a 2D list of sets to use as a dynamic programming table
5 dp = [[set() for _ in range(length_s2 + 1)] for _ in range(length_s1 + 1)]
6
7 # Initialize the first cell of the dynamic programming table
8 dp[0][0].add(0)
9
10 # Iterate through all positions in both strings
11 for i in range(length_s1 + 1):
12 for j in range(length_s2 + 1):
13 for delta in dp[i][j]:
14 # When s1 contains a number
15 number = 0
16 if delta <= 0:
17 for p in range(i, min(i + 3, length_s1)):
18 if s1[p].isdigit():
19 number = number * 10 + int(s1[p])
20 dp[p + 1][j].add(delta + number)
21 else:
22 # If it's not a digit, stop the sequence
23 break
24
25 # When s2 contains a number
26 number = 0
27 if delta >= 0:
28 for q in range(j, min(j + 3, length_s2)):
29 if s2[q].isdigit():
30 number = number * 10 + int(s2[q])
31 dp[i][q + 1].add(delta - number)
32 else:
33 # If it's not a digit, stop the sequence
34 break
35
36 # Assuming that the numbers might represent a single character
37 # in either s1 or s2 if the next char is a letter
38 if i < length_s1 and delta < 0 and not s1[i].isdigit():
39 dp[i + 1][j].add(delta + 1)
40 if j < length_s2 and delta > 0 and not s2[j].isdigit():
41 dp[i][j + 1].add(delta - 1)
42
43 # Check if current characters in both strings match
44 # and have equal length (delta is zero)
45 if i < length_s1 and j < length_s2 and delta == 0 and s1[i] == s2[j]:
46 dp[i + 1][j + 1].add(0)
47
48 # Return true if a match is found at the end of both strings
49 # with no remaining length difference
50 return 0 in dp[length_s1][length_s2]
51
52# Utility function to check if a character is a digit
53def is_digit(char: str) -> bool:
54 return char.isdigit()
55
1import java.util.Set;
2import java.util.HashSet;
3
4public class Solution {
5 // Function to determine if two strings possibly match, by accounting for numbers in strings that can represent variable lengths
6 public boolean possiblyEquals(String s1, String s2) {
7 int lengthS1 = s1.length();
8 int lengthS2 = s2.length();
9
10 // Create dynamic programming table, where each cell contains a set of integers representing possible length differences
11 Set<Integer>[][] dp = new HashSet[lengthS1 + 1][lengthS2 + 1];
12 for (int i = 0; i <= lengthS1; ++i) {
13 for (int j = 0; j <= lengthS2; ++j) {
14 dp[i][j] = new HashSet<>();
15 }
16 }
17
18 // Initialize the first cell of the dynamic programming table
19 dp[0][0].add(0);
20
21 // Iterate through all positions in both strings
22 for (int i = 0; i <= lengthS1; i++) {
23 for (int j = 0; j <= lengthS2; j++) {
24 // Explore all possible length differences for the current cell
25 for (Integer delta : dp[i][j]) {
26 // When s1 contains a number
27 int number = 0;
28 if (delta <= 0) {
29 for (int p = i; p < Math.min(i + 3, lengthS1); p++) {
30 if (Character.isDigit(s1.charAt(p))) {
31 number = number * 10 + Character.getNumericValue(s1.charAt(p));
32 dp[p + 1][j].add(delta + number);
33 } else {
34 // If it's not a digit, stop the sequence
35 break;
36 }
37 }
38 }
39
40 // When s2 contains a number
41 number = 0;
42 if (delta >= 0) {
43 for (int q = j; q < Math.min(j + 3, lengthS2); q++) {
44 if (Character.isDigit(s2.charAt(q))) {
45 number = number * 10 + Character.getNumericValue(s2.charAt(q));
46 dp[i][q + 1].add(delta - number);
47 } else {
48 // If it's not a digit, stop the sequence
49 break;
50 }
51 }
52 }
53
54 // If there is a number in s1 and the next character in s1 is a letter,
55 // it is assumed that the number in s1 might represent a single character
56 if (i < lengthS1 && delta < 0 && !Character.isDigit(s1.charAt(i))) {
57 dp[i + 1][j].add(delta + 1);
58 }
59
60 // If there is a number in s2 and the next character in s2 is a letter,
61 // it is assumed that the number in s2 might represent a single character
62 if (j < lengthS2 && delta > 0 && !Character.isDigit(s2.charAt(j))) {
63 dp[i][j + 1].add(delta - 1);
64 }
65
66 // Check if the current characters in both strings match and have equal length difference (delta is zero)
67 if (i < lengthS1 && j < lengthS2 && delta == 0 && s1.charAt(i) == s2.charAt(j)) {
68 dp[i + 1][j + 1].add(0);
69 }
70 }
71 }
72 }
73
74 // Return true if a match is found at the end of both strings with no remaining length difference
75 return dp[lengthS1][lengthS2].contains(0);
76 }
77
78 // Helper function to check if a character is a digit (actually not needed as Java has Character.isDigit)
79 private boolean isDigit(char ch) {
80 return Character.isDigit(ch);
81 }
82}
83
1#include <vector>
2#include <string>
3#include <set>
4#include <algorithm>
5#include <cctype>
6
7// Function to check if a given character is a digit
8bool isDigit(char ch) {
9 return std::isdigit(static_cast<unsigned char>(ch)) != 0;
10}
11
12// Function to check if two strings possibly equals with variable-length block representations
13bool possiblyEquals(const std::string& s1, const std::string& s2) {
14 int lengthS1 = s1.length(), lengthS2 = s2.length();
15 std::vector<std::vector<std::set<int>>> dp(lengthS1 + 1, std::vector<std::set<int>>(lengthS2 + 1));
16
17 // Initialize the first cell of the dynamic programming table
18 dp[0][0].insert(0);
19
20 // Iterate through all positions in both strings
21 for (int i = 0; i <= lengthS1; ++i) {
22 for (int j = 0; j <= lengthS2; ++j) {
23 for (int delta : dp[i][j]) {
24 // When s1 contains a number
25 int number = 0;
26 if (delta <= 0) {
27 for (int p = i; p < std::min(i + 3, lengthS1); ++p) {
28 if (isDigit(s1[p])) {
29 number = number * 10 + (s1[p] - '0');
30 dp[p + 1][j].insert(delta + number);
31 } else {
32 // If it's not a digit, stop the sequence
33 break;
34 }
35 }
36 }
37
38 // When s2 contains a number
39 number = 0;
40 if (delta >= 0) {
41 for (int q = j; q < std::min(j + 3, lengthS2); ++q) {
42 if (isDigit(s2[q])) {
43 number = number * 10 + (s2[q] - '0');
44 dp[i][q + 1].insert(delta - number);
45 } else {
46 // If it's not a digit, stop the sequence
47 break;
48 }
49 }
50 }
51
52 // If there is a number in s1 and the next character in s1 is a letter,
53 // it is assumed that the number in s1 might represent a single character
54 if (i < lengthS1 && delta < 0 && !isDigit(s1[i])) {
55 dp[i + 1][j].insert(delta + 1);
56 }
57
58 // If there is a number in s2 and the next character in s2 is a letter,
59 // it is assumed that the number in s2 might represent a single character
60 if (j < lengthS2 && delta > 0 && !isDigit(s2[j])) {
61 dp[i][j + 1].insert(delta - 1);
62 }
63
64 // Check if the current characters in both strings match and have equal length,
65 // only when delta is zero
66 if (i < lengthS1 && j < lengthS2 && delta == 0 && s1[i] == s2[j]) {
67 dp[i + 1][j + 1].insert(0);
68 }
69 }
70 }
71 }
72
73 // Return true if a match is found at the end of both strings with no remaining length difference
74 return dp[lengthS1][lengthS2].count(0) > 0;
75}
76
1function possiblyEquals(s1: string, s2: string): boolean {
2 const lengthS1 = s1.length,
3 lengthS2 = s2.length;
4 let dp: Array<Array<Set<number>>> = Array.from({ length: lengthS1 + 1 }, () =>
5 Array.from({ length: lengthS2 + 1 }, () => new Set<number>()),
6 );
7
8 // Initialize the first cell of the dynamic programming table
9 dp[0][0].add(0);
10
11 // Iterate through all positions in both strings
12 for (let i = 0; i <= lengthS1; i++) {
13 for (let j = 0; j <= lengthS2; j++) {
14 for (const delta of dp[i][j]) {
15 // When s1 contains a number
16 let number = 0;
17 if (delta <= 0) {
18 for (let p = i; p < Math.min(i + 3, lengthS1); p++) {
19 if (isDigit(s1[p])) {
20 number = number * 10 + Number(s1[p]);
21 dp[p + 1][j].add(delta + number);
22 } else {
23 // If it's not a digit, stop the sequence
24 break;
25 }
26 }
27 }
28
29 // When s2 contains a number
30 number = 0;
31 if (delta >= 0) {
32 for (let q = j; q < Math.min(j + 3, lengthS2); q++) {
33 if (isDigit(s2[q])) {
34 number = number * 10 + Number(s2[q]);
35 dp[i][q + 1].add(delta - number);
36 } else {
37 // If it's not a digit, stop the sequence
38 break;
39 }
40 }
41 }
42
43 // If there is a number in s1 and the next character in s1 is a letter,
44 // it is assumed that the number in s1 might represent a single character
45 if (i < lengthS1 && delta < 0 && !isDigit(s1[i])) {
46 dp[i + 1][j].add(delta + 1);
47 }
48
49 // If there is a number in s2 and the next character in s2 is a letter,
50 // it is assumed that the number in s2 might represent a single character
51 if (j < lengthS2 && delta > 0 && !isDigit(s2[j])) {
52 dp[i][j + 1].add(delta - 1);
53 }
54
55 // Check if the current characters in both strings match and have equal length,
56 // only when delta is zero
57 if (i < lengthS1 && j < lengthS2 && delta === 0 && s1[i] === s2[j]) {
58 dp[i + 1][j + 1].add(0);
59 }
60 }
61 }
62 }
63
64 // Return true if a match is found at the end of both strings with no remaining length difference
65 return dp[lengthS1][lengthS2].has(0);
66}
67
68// Helper function to check if a character is a digit
69function isDigit(char: string): boolean {
70 return /^\d$/g.test(char);
71}
72
Time and Space Complexity
Time Complexity
The time complexity of the given code can be analyzed as follows:
- The outer two loops run through the lengths of
s1
ands2
, respectively, giving us a base factor ofO(n * m)
, wheren
is the length ofs1
andm
is the length ofs2
. - Inside the nested loops, the code processes potential numeric representations in both
s1
ands2
up to three digits long. Since the length of numeric sequences is capped at 3, this contributes a constant factor, not depending onn
orm
. - For each
i
andj
, the setdp[i][j]
could in the worst case hold a number of elements proportional to the sum of lengthsn + m
, as it stores the differences between the "translated" lengths ofs1[i...]
ands2[j...]
. - The set operations such as
.add()
have a complexity ofO(1)
on average but could degenerate toO(k)
wherek
is the number of items already in the set in the worst-case scenario due to hash collisions.
Putting it all together, the time complexity can be estimated as O(n * m * (n + m))
.
Space Complexity
Analyzing the space complexity:
- The dynamic programming table
dp
is a 2D array of sets with dimensions(n + 1) * (m + 1)
. The space taken by this table is the dominant factor. - Each set within
dp[i][j]
can potentially store up ton + m
integers representing various differences between the lengths ofs1
ands2
.
Hence, the total space complexity is O(n * m * (n + m))
.
Learn more about how to find time and space complexity quickly using problem constraints.
What are the most two important steps in writing a depth first search function? (Select 2)
Recommended Readings
What is Dynamic Programming Prerequisite DFS problems dfs_intro Backtracking problems backtracking Memoization problems memoization_intro Pruning problems backtracking_pruning Dynamic programming is an algorithmic optimization technique that breaks down a complicated problem into smaller overlapping sub problems in a recursive manner and uses solutions to the sub problems to construct a solution
LeetCode Patterns Your Personal Dijkstra's Algorithm to Landing Your Dream Job The goal of AlgoMonster is to help you get a job in the shortest amount of time possible in a data driven way We compiled datasets of tech interview problems and broke them down by patterns This way we
Recursion Recursion is one of the most important concepts in computer science Simply speaking recursion is the process of a function calling itself Using a real life analogy imagine a scenario where you invite your friends to lunch https algomonster s3 us east 2 amazonaws com recursion jpg You first
Want a Structured Path to Master System Design Too? Don’t Miss This!