1548. The Most Similar Path in a Graph


Problem Description

In this problem, we're provided with a fictional map of n cities, connected by m bidirectional roads. Each city has a unique name represented by a three-letter uppercase English string. We can travel from any city to any other city via these roads because the map is described as a connected graph.

We are also given a targetPath, which is an array of city names that we aim to follow. Our goal is to find an actual path through the graph — starting and ending at any cities — that either exactly matches or is most similar to this targetPath. The 'similarity' here is measured by the edit distance, which counts how many names differ between the targetPath and the path we find.

Our task is to return the sequence of city indices that corresponds to the path with the minimum edit distance to the targetPath. If more than one path has the same minimum edit distance, we can return any one of them.

The challenge involves taking into consideration not just the sequence of city names but also the road connections — making sure each city in our path is connected to the next.

Intuition

To tackle this problem, dynamic programming provides a structured way to consider all possible paths and city combinations methodically. We use a dynamic programming table, f, where each element f[i][j] represents the minimum edit distance of the path from the start to city j (at index j) matching up to the i-th element in the targetPath. This essentially breaks down the larger problem into smaller subproblems: the edit distance of shorter prefixes of the target path.

Another array, pre, is used to store which city we came from when we arrived at a particular city with the minimum edit distance for a given prefix of the target path. This is essential for backtracking to find the actual path once we've computed all the minimum distances.

The solution iterates through the target path and for each city in the target path, it calculates the minimum edit distance for reaching every city in the graph, taking into account which city we could have come from (based on the graph connections). At the end of the iterative process, we identify the city from which we can reach the end of the target path with the minimum edit distance. Then we backtrack using the pre array to recover the actual path that gives us this minimum edit distance.

This dynamic programming approach ensures that we do not recompute the edit distance for the same subproblems, which significantly improves efficiency compared to a naive recursive approach.

Learn more about Graph and Dynamic Programming patterns.

Solution Approach

The solution to the problem utilizes dynamic programming, an optimization technique that solves complex problems by breaking them down into simpler subproblems. Here is a step-by-step explanation of the implementation using the concepts mentioned in the Reference Solution Approach:

Step 1: Graph Construction

  • Firstly, we build an adjacency list g based on the input list of roads. This list is used to quickly identify connected cities. Each entry g[i] contains a list of indices corresponding to cities directly reachable from city i.

Step 2: Initialization

  • Next, we prepare a 2D array f to keep track of the minimum edit distances. f[i][j] will hold the minimum edit distance after comparing i elements of targetPath to the path ending in city j.
  • We also create a 2D array pre to store the previous city for each state, which will allow us to backtrack the optimal path once we have the minimum edit distances.
  • Initially, we populate the first row of f by comparing the first city named in targetPath to each city name in names and recording the edit distance (0 if the names match, 1 if they differ).

Step 3: Dynamic Programming - Filling the Table

  • We then iterate through each element i of targetPath, and for each city j, we calculate the minimum edit distance by considering all cities k that are connected to j (k comes from the adjacency list of j, g[j]).
  • The edit distance f[i][j] is set to the minimum of the previously calculated edit distance f[i - 1][k] (the edit distance for one fewer city in targetPath and ending in a city connected to j) plus the cost of the current step (which is 0 if the city names match and 1 otherwise).
  • While computing f[i][j], we also update the pre[i][j] array to remember which preceding city k led to the minimum edit distance for city j.

Step 4: Backtracking

  • After filling in the dynamic programming table, the algorithm determines the city that would end the path with the minimum edit distance. This step involves a simple iteration over the last row of f to find the minimum value.
  • Starting from this city, the backtracking process is commenced using the pre array to reconstruct the most similar path. We step backward through pre from the last city, appending each city to the result path ans as we go.

Throughout steps 3 and 4, the algorithm uses a ternary operator (t := ...) as a way to calculate and directly use the potential new edit distance before comparing it with the current minimum. This allows for concise code that both computes the new distance and updates it conditionally.

This implementation leverages the adjacency list for efficient graph traversal, a dynamic programming matrix for optimal substructure and overlapping subproblems, and backtracking to construct the final solution path.

Ready to land your dream job?

Unlock your dream job with a 2-minute evaluator for a personalized learning plan!

Start Evaluator

Example Walkthrough

Let's illustrate the solution approach using a small example:

Suppose we have 4 cities connected by roads and a targetPath we need to follow or find the most similar path to:

  • The cities are named ["ABC", "ACD", "BDC", "BCC"]
  • The roads connect the cities as follows: [[0, 1], [1, 2], [2, 3]] (city 0 connects to city 1, city 1 to city 2, and city 2 to city 3)
  • The targetPath is ["ABC", "BDC", "BCC"]

Step 1: Graph Construction

We create the adjacency list g:

g[0] = [1]
g[1] = [0, 2]
g[2] = [1, 3]
g[3] = [2]

Step 2: Initialization

We prepare f and pre with dimensions [targetPath.length][cities.length] and initialize f[0] as follows (0 for a match and 1 for no match):

f[0] = [0, 1, 1, 1]

pre doesn't need initialization as it will be filled during dynamic programming.

Step 3: Dynamic Programming - Filling the Table

We fill the table by iterating over each targetPath elements and cities:

Iteration for targetPath[1] = "BDC":
For city 0 ("ABC"): f[1][0] = min(f[0][1] + 1) = min(2) = 2
For city 1 ("ACD"): f[1][1] = min(f[0][0] + 1, f[0][2] + 1) = min(1, 2) = 1
For city 2 ("BDC"): f[1][2] = min(f[0][1] + 0) = min(1) = 1
For city 3 ("BCC"): f[1][3] = min(f[0][2] + 1) = min(2) = 2

Iteration for targetPath[2] = "BCC":
For city 0 ("ABC"): f[2][0] = min(f[1][1] + 1) = min(2) = 2
For city 1 ("ACD"): f[2][1] = min(f[1][0] + 1, f[1][2] + 1) = min(3, 2) = 2
For city 2 ("BDC"): f[2][2] = min(f[1][1] + 1) = min(2) = 2
For city 3 ("BCC"): f[2][3] = min(f[1][2] + 0) = min(1) = 1

During calculations, we update pre to remember the path. For simplicity, the example shows values just for f.

Step 4: Backtracking

The last row of f is [2, 2, 2, 1], so the minimum edit distance is 1, ending with city 3 ("BCC"). We backtrack to find the most similar path:

Most similar path: [2 (from "BDC"), 3 ("BCC")]

Note: In a full implementation, pre would record the previous node, and we would backtrack through it to construct the path, likely leading to a longer path that then includes the starting city and any intermediate cities in our most similar path.

This example demonstrates the dynamic programming approach, how we calculate the minimum edit distances, and how we backtrack from the minimum edit distance value to find the path with the closest similarity to the targetPath.

Solution Implementation

1from typing import List
2from math import inf
3
4class Solution:
5    def mostSimilar(self, 
6                    num_cities: int, 
7                    roads: List[List[int]], 
8                    city_names: List[str], 
9                    target_path: List[str]) -> List[int]:
10        # Initialize an adjacency list for graph representation of roads.
11        graph = [[] for _ in range(num_cities)]
12        for city1, city2 in roads:
13            graph[city1].append(city2)
14            graph[city2].append(city1)
15      
16        # Number of cities in the target path.
17        path_length = len(target_path)
18      
19        # Initialize the DP table for minimum edit distance and predecessor table.
20        dp = [[inf] * num_cities for _ in range(path_length)]
21        predecessors = [[-1] * num_cities for _ in range(path_length)]
22      
23        # Fill the first row of DP table where only the first city of target path is considered.
24        for city_idx, city_name in enumerate(city_names):
25            dp[0][city_idx] = (target_path[0] != city_name)
26      
27        # Populate the DP table.
28        for i in range(1, path_length):
29            for current_city in range(num_cities):
30                for neighbor in graph[current_city]:
31                    cost = dp[i - 1][neighbor] + (target_path[i] != city_names[current_city])
32                    if cost < dp[i][current_city]:
33                        dp[i][current_city] = cost
34                        predecessors[i][current_city] = neighbor
35      
36        # Find the city that forms the end of the path with the minimum difference.
37        min_difference = inf
38        end_city = 0
39        for city_idx in range(num_cities):
40            if dp[-1][city_idx] < min_difference:
41                min_difference = dp[-1][city_idx]
42                end_city = city_idx
43      
44        # Reconstruct the path starting from the end city.
45        result_path = [0] * path_length
46        for i in range(path_length - 1, -1, -1):
47            result_path[i] = end_city
48            end_city = predecessors[i][end_city]
49      
50        # Return the reconstructed path.
51        return result_path
52
1import java.util.ArrayList;
2import java.util.Arrays;
3import java.util.Collections;
4import java.util.List;
5
6class Solution {
7    public List<Integer> mostSimilar(int numCities, int[][] roads, String[] cityNames, String[] targetPath) {
8        // Graph representation: g stores adjacency list of cities
9        List<Integer>[] adjacencyList = new List[numCities];
10        Arrays.setAll(adjacencyList, i -> new ArrayList<>());
11        // Build the graph using roads input
12        for (int[] road : roads) {
13            int cityA = road[0], cityB = road[1];
14            adjacencyList[cityA].add(cityB);
15            adjacencyList[cityB].add(cityA);
16        }
17
18        // Problem dimensions
19        int pathLength = targetPath.length;
20      
21        // Initialize DP arrays
22        final int INFINITY = 1 << 30; // A large number to simulate infinity
23        int[][] dp = new int[pathLength][numCities]; // DP array for storing minimum edit distance
24        int[][] previous = new int[pathLength][numCities]; // Stores predecessor for path reconstruction
25      
26        // Initialize dp and previous arrays with INFINITY and -1 respectively
27        for (int i = 0; i < pathLength; i++) {
28            Arrays.fill(dp[i], INFINITY);
29            Arrays.fill(previous[i], -1);
30        }
31
32        // Base case for DP array: compare the first city name in targetPath with all cityNames
33        for (int j = 0; j < numCities; ++j) {
34            dp[0][j] = targetPath[0].equals(cityNames[j]) ? 0 : 1;
35        }
36
37        // Fill in the DP array
38        for (int i = 1; i < pathLength; ++i) {
39            for (int j = 0; j < numCities; ++j) {
40                for (int k : adjacencyList[j]) {
41                    int cost = dp[i - 1][k] + (targetPath[i].equals(cityNames[j]) ? 0 : 1);
42                    if (cost < dp[i][j]) {
43                        dp[i][j] = cost;
44                        previous[i][j] = k;
45                    }
46                }
47            }
48        }
49
50        // Reconstruct the path with the minimum edit distance
51        int minDistance = INFINITY, endingCity = 0;
52        for (int j = 0; j < numCities; ++j) {
53            if (dp[pathLength - 1][j] < minDistance) {
54                minDistance = dp[pathLength - 1][j];
55                endingCity = j;
56            }
57        }
58
59        // Backtrack to find the most similar path
60        List<Integer> mostSimilarPath = new ArrayList<>();
61        for (int i = pathLength - 1; i >= 0; --i) {
62            mostSimilarPath.add(endingCity);
63            endingCity = previous[i][endingCity];
64        }
65
66        // Reverse the mostSimilarPath to get the correct order since we backtracked
67        Collections.reverse(mostSimilarPath);
68        return mostSimilarPath;
69    }
70}
71
1#include <vector>
2#include <string>
3#include <cstring> // To use memset
4
5using std::vector;
6using std::string;
7
8class Solution {
9public:
10    vector<int> mostSimilar(int numCities, vector<vector<int>>& roads, vector<string>& cityNames, vector<string>& targetPath) {
11        // Adjacency list representation of the graph
12        vector<int> graph[numCities];
13        for (auto& road : roads) {
14            int from = road[0], to = road[1];
15            graph[from].push_back(to);
16            graph[to].push_back(from);
17        }
18
19        int pathLength = targetPath.size();
20        int dp[pathLength][numCities];
21        int predecessors[pathLength][numCities];
22
23        // Initialize the dp and predecessors arrays with maximum value and -1 respectively
24        memset(dp, 0x3f, sizeof(dp));
25        memset(predecessors, -1, sizeof(predecessors));
26
27        // Base cases for the first city comparison in the target path
28        for (int j = 0; j < numCities; ++j) {
29            dp[0][j] = targetPath[0] != cityNames[j];
30        }
31
32        // Build the dp table
33        for (int i = 1; i < pathLength; ++i) {
34            for (int j = 0; j < numCities; ++j) {
35                for (int neighbor : graph[j]) {
36                    int cost = dp[i - 1][neighbor] + (targetPath[i] != cityNames[j]);
37                    if (cost < dp[i][j]) {
38                        dp[i][j] = cost;
39                        predecessors[i][j] = neighbor;
40                    }
41                }
42            }
43        }
44
45        // Find the city with the minimum cost for the last step in targetPath
46        int minIndex = 0;
47        int minCost = std::numeric_limits<int>::max();
48        for (int j = 0; j < numCities; ++j) {
49            if (dp[pathLength - 1][j] < minCost) {
50                minCost = dp[pathLength - 1][j];
51                minIndex = j;
52            }
53        }
54
55        // Reconstruct the path from the dp table
56        vector<int> answer(pathLength);
57        for (int i = pathLength - 1; i >= 0; --i) {
58            answer[i] = minIndex;
59            minIndex = predecessors[i][minIndex];
60        }
61
62        return answer;
63    }
64};
65
1// This function finds the path in the graph of cities that most closely matches a given target path.
2function mostSimilar(
3    cityCount: number,         // Total number of cities
4    roads: number[][],         // Connections between cities
5    cityNames: string[],       // Names of the cities
6    targetPath: string[],      // Target path to compare with
7): number[] {
8    // Graph representation where each city index contains its connected cities.
9    const graph: number[][] = Array.from({ length: cityCount }, () => []);
10
11    // Constructing the graph from provided roads data.
12    for (const [city1, city2] of roads) {
13        graph[city1].push(city2);
14        graph[city2].push(city1);
15    }
16
17    // Length of the targetPath.
18    const targetPathLength = targetPath.length;
19
20    // f is used for dynamic programming; it stores the minimum difference at each step.
21    const differences = Array.from({ length: targetPathLength }, () => Array.from({ length: cityCount }, () => Infinity));
22
23    // Predecessors array to rebuild the path with the minimum difference.
24    const predecessors: number[][] = Array.from({ length: targetPathLength }, () => Array.from({ length: cityCount }, () => -1));
25
26    // Initialize the differences for the first city in the target path.
27    for (let j = 0; j < cityCount; ++j) {
28        differences[0][j] = cityNames[j] === targetPath[0] ? 0 : 1;
29    }
30
31    // Fill the differences array by comparing each city name with the target path name at each step.
32    for (let i = 1; i < targetPathLength; ++i) {
33        for (let j = 0; j < cityCount; ++j) {
34            for (const neighbor of graph[j]) {
35                const comparisonCost = differences[i - 1][neighbor] + (cityNames[j] === targetPath[i] ? 0 : 1);
36                if (comparisonCost < differences[i][j]) {
37                    differences[i][j] = comparisonCost;
38                    predecessors[i][j] = neighbor;
39                }
40            }
41        }
42    }
43
44    // Find the ending city of the path with the minimum difference.
45    let bestCity = 0;
46    let minDifference = Infinity;
47    for (let j = 0; j < cityCount; ++j) {
48        if (differences[targetPathLength - 1][j] < minDifference) {
49            minDifference = differences[targetPathLength - 1][j];
50            bestCity = j;
51        }
52    }
53
54    // Backtracking to build the path by using the predecessors array.
55    const closestPath: number[] = Array(targetPathLength).fill(0);
56    for (let i = targetPathLength - 1; i >= 0; --i) {
57        closestPath[i] = bestCity;
58        bestCity = predecessors[i][bestCity];
59    }
60
61    // Return the path that is most similar to the target path.
62    return closestPath;
63}
64

Time and Space Complexity

Time Complexity

The time complexity of the code is primarily determined by the three nested loops:

  1. The first loop runs over the length of targetPath, which is m.
  2. The second loop runs over all nodes, which is n.
  3. The inside loop runs over the adjacent nodes for each node, which in the worst case could be n - 1 (since it's an undirected graph, and a node could potentially be connected to all other nodes).

This results in a time complexity of O(m * n * n) or simplified as O(m * n^2).

Space Complexity

The space complexity is influenced by:

  1. The adjacency list for the graph g, which in the worst case, if every node is connected to every other node, would take up O(n^2) space.
  2. The DP matrix f, which has m lists each of size n, thus taking O(m * n) space.
  3. The predecessor matrix pre, which also has m lists each of size n, so it also takes O(m * n) space.
  4. The final ans array, with a size of m, has a negligible impact compared to the rest, contributing O(m) space.

Considering all these, the space complexity of the code comes down to O(m * n) since O(n^2) could be larger than O(m * n), but it is not accounted for more than once.

Learn more about how to find time and space complexity quickly using problem constraints.


Discover Your Strengths and Weaknesses: Take Our 2-Minute Quiz to Tailor Your Study Plan:
Question 1 out of 10

Which of the following shows the order of node visit in a Breadth-first Search?


Recommended Readings

Want a Structured Path to Master System Design Too? Don’t Miss This!