535. Encode and Decode TinyURL
Problem Description
This problem asks you to design a URL shortening service similar to TinyURL. You need to create a system that can convert long URLs into short URLs and then convert those short URLs back to their original long URLs.
The task requires implementing a Codec
class with the following functionality:
- Constructor (
__init__
): Initializes the URL shortening system encode(longUrl)
: Takes a long URL as input (likehttps://leetcode.com/problems/design-tinyurl
) and returns a shortened version (likehttp://tinyurl.com/4e9iAk
)decode(shortUrl)
: Takes a shortened URL and returns the original long URL
The key requirements are:
- Every long URL should be encoded to a unique short URL
- Every short URL should decode back to its original long URL
- The same object instance will be used for both encoding and decoding operations
- There are no restrictions on the specific algorithm you use for encoding/decoding
The solution uses a simple counter-based approach:
- Maintains a dictionary (
m
) to store mappings between IDs and long URLs - Uses an incrementing counter (
idx
) to generate unique IDs for each new URL - Creates short URLs by appending the ID to a base domain (
https://tinyurl.com/
) - When decoding, extracts the ID from the short URL and looks up the original URL in the dictionary
For example:
- Input:
encode("https://leetcode.com/problems/design-tinyurl")
- Output:
"https://tinyurl.com/1"
(for the first URL) - Input:
decode("https://tinyurl.com/1")
- Output:
"https://leetcode.com/problems/design-tinyurl"
Intuition
The core challenge in URL shortening is creating a bidirectional mapping between long URLs and short identifiers. We need a way to generate unique short codes and remember which long URL each code corresponds to.
The simplest approach is to think of this as a database problem. When we receive a long URL, we assign it a unique ID and store the mapping. This leads us to consider using a counter that increments with each new URL - it's guaranteed to be unique and simple to implement.
Why use a counter instead of other approaches like hashing? Hashing might seem natural, but it has complications:
- Hash collisions would require handling
- We'd need to store the hash-to-URL mapping anyway
- The hash output might be longer than necessary
A sequential counter gives us the shortest possible unique identifiers: 1
, 2
, 3
, etc. This minimizes the length of our short URLs.
For storage, we need a data structure that allows fast lookups when decoding. A dictionary (hash map) is perfect - it gives us O(1)
lookup time. We use the counter value as the key and the long URL as the value.
The encoding process becomes:
- Increment the counter to get a new unique ID
- Store the ID-to-URL mapping in our dictionary
- Return the short URL by appending the ID to our base domain
The decoding process is even simpler:
- Extract the ID from the short URL (the part after the last
/
) - Look up the ID in our dictionary to get the original URL
This approach trades simplicity for features. While more sophisticated solutions might handle custom aliases, analytics, or URL deduplication, this counter-based method solves the core problem with minimal complexity and guaranteed correctness.
Solution Approach
The implementation uses a counter-based approach with a hash map for storage. Let's walk through each component:
Data Structure Setup:
self.m = defaultdict()
: A dictionary to store the mapping between IDs and long URLsself.idx = 0
: A counter that starts at 0 and increments for each new URLself.domain = 'https://tinyurl.com/'
: The base domain for all shortened URLs
Encoding Algorithm:
def encode(self, longUrl: str) -> str:
self.idx += 1
self.m[str(self.idx)] = longUrl
return f'{self.domain}{self.idx}'
The encoding process:
- Increment the counter
self.idx
by 1 to generate a new unique ID - Store the mapping in the dictionary with the ID (converted to string) as the key and the long URL as the value
- Construct the short URL by concatenating the domain with the ID using an f-string
- Return the complete short URL (e.g.,
'https://tinyurl.com/1'
)
Decoding Algorithm:
def decode(self, shortUrl: str) -> str:
idx = shortUrl.split('/')[-1]
return self.m[idx]
The decoding process:
- Extract the ID from the short URL by splitting on
/
and taking the last element- For example:
'https://tinyurl.com/1'.split('/')
gives['https:', '', 'tinyurl.com', '1']
- Taking
[-1]
gives us'1'
- For example:
- Use the extracted ID to look up the original URL in the dictionary
- Return the original long URL
Time and Space Complexity:
encode()
: O(1) time - incrementing counter and dictionary insertion are constant time operationsdecode()
: O(1) time - string splitting is O(k) where k is the number of/
characters (constant for our URLs), and dictionary lookup is O(1)- Space: O(n) where n is the number of unique URLs encoded, as we store each mapping in the dictionary
This solution guarantees no collisions since each URL gets a unique, monotonically increasing ID. The trade-off is that the short URLs are predictable and sequential, which might not be desirable in a real-world system where security or unpredictability is important.
Ready to land your dream job?
Unlock your dream job with a 5-minute evaluator for a personalized learning plan!
Start EvaluatorExample Walkthrough
Let's walk through encoding and decoding two URLs to see how the counter-based approach works:
Initial State:
idx = 0
m = {}
(empty dictionary)domain = 'https://tinyurl.com/'
Step 1: Encode first URL
- Input:
encode("https://leetcode.com/problems/design-tinyurl")
- Process:
- Increment
idx
:0 → 1
- Store in dictionary:
m['1'] = "https://leetcode.com/problems/design-tinyurl"
- Create short URL:
'https://tinyurl.com/' + '1'
- Increment
- Output:
"https://tinyurl.com/1"
- State after:
idx = 1
,m = {'1': 'https://leetcode.com/problems/design-tinyurl'}
Step 2: Encode second URL
- Input:
encode("https://example.com/very/long/path")
- Process:
- Increment
idx
:1 → 2
- Store in dictionary:
m['2'] = "https://example.com/very/long/path"
- Create short URL:
'https://tinyurl.com/' + '2'
- Increment
- Output:
"https://tinyurl.com/2"
- State after:
idx = 2
,m = {'1': 'https://leetcode.com/problems/design-tinyurl', '2': 'https://example.com/very/long/path'}
Step 3: Decode first short URL
- Input:
decode("https://tinyurl.com/1")
- Process:
- Split URL:
"https://tinyurl.com/1".split('/')
→['https:', '', 'tinyurl.com', '1']
- Extract ID: Take last element →
'1'
- Lookup in dictionary:
m['1']
→"https://leetcode.com/problems/design-tinyurl"
- Split URL:
- Output:
"https://leetcode.com/problems/design-tinyurl"
Step 4: Decode second short URL
- Input:
decode("https://tinyurl.com/2")
- Process:
- Split URL and get last element →
'2'
- Lookup:
m['2']
→"https://example.com/very/long/path"
- Split URL and get last element →
- Output:
"https://example.com/very/long/path"
The counter ensures each URL gets a unique ID (1, 2, 3...), and the dictionary provides instant lookup for decoding.
Solution Implementation
1from collections import defaultdict
2
3class Codec:
4 def __init__(self):
5 # Dictionary to store mapping from short URL ID to long URL
6 self.url_map = defaultdict(str)
7 # Counter for generating unique IDs for each URL
8 self.counter = 0
9 # Base domain for the shortened URLs
10 self.base_domain = 'https://tinyurl.com/'
11
12 def encode(self, longUrl: str) -> str:
13 """Encodes a URL to a shortened URL.
14
15 Args:
16 longUrl: The original long URL to be shortened
17
18 Returns:
19 A shortened URL with format: https://tinyurl.com/{id}
20 """
21 # Increment counter to generate a unique ID
22 self.counter += 1
23 # Store the mapping from ID to long URL
24 self.url_map[str(self.counter)] = longUrl
25 # Return the shortened URL
26 return f'{self.base_domain}{self.counter}'
27
28 def decode(self, shortUrl: str) -> str:
29 """Decodes a shortened URL to its original URL.
30
31 Args:
32 shortUrl: The shortened URL to be decoded
33
34 Returns:
35 The original long URL
36 """
37 # Extract the ID from the shortened URL (last part after '/')
38 url_id = shortUrl.split('/')[-1]
39 # Look up and return the original URL using the ID
40 return self.url_map[url_id]
41
42
43# Your Codec object will be instantiated and called as such:
44# codec = Codec()
45# codec.decode(codec.encode(url))
46
1public class Codec {
2 // HashMap to store the mapping between shortened key and original URL
3 private Map<String, String> urlMapping = new HashMap<>();
4
5 // Counter to generate unique IDs for each URL
6 private int counter = 0;
7
8 // Base domain for the shortened URL
9 private static final String BASE_DOMAIN = "https://tinyurl.com/";
10
11 /**
12 * Encodes a long URL to a shortened URL.
13 * @param longUrl the original URL to be shortened
14 * @return the shortened URL with unique identifier
15 */
16 public String encode(String longUrl) {
17 // Generate unique ID by incrementing counter
18 String uniqueId = String.valueOf(++counter);
19
20 // Store the mapping of ID to original URL
21 urlMapping.put(uniqueId, longUrl);
22
23 // Return the complete shortened URL
24 return BASE_DOMAIN + uniqueId;
25 }
26
27 /**
28 * Decodes a shortened URL to its original URL.
29 * @param shortUrl the shortened URL to decode
30 * @return the original long URL
31 */
32 public String decode(String shortUrl) {
33 // Find the position after the last '/' to extract the unique ID
34 int lastSlashIndex = shortUrl.lastIndexOf('/') + 1;
35
36 // Extract the unique ID from the shortened URL
37 String uniqueId = shortUrl.substring(lastSlashIndex);
38
39 // Retrieve and return the original URL using the ID
40 return urlMapping.get(uniqueId);
41 }
42}
43
44// Your Codec object will be instantiated and called as such:
45// Codec codec = new Codec();
46// codec.decode(codec.encode(url));
47
1class Solution {
2public:
3 /**
4 * Encodes a long URL to a shortened URL.
5 * @param longUrl The original URL to be shortened
6 * @return The shortened URL with format "https://tinyurl.com/{id}"
7 */
8 string encode(string longUrl) {
9 // Generate a unique ID by incrementing the counter
10 string uniqueId = to_string(++idCounter);
11
12 // Store the mapping from ID to original URL
13 idToUrlMap[uniqueId] = longUrl;
14
15 // Return the shortened URL by concatenating domain with ID
16 return DOMAIN_PREFIX + uniqueId;
17 }
18
19 /**
20 * Decodes a shortened URL to its original URL.
21 * @param shortUrl The shortened URL to be decoded
22 * @return The original long URL
23 */
24 string decode(string shortUrl) {
25 // Find the position after the last '/' to extract the ID
26 int idStartPos = shortUrl.rfind('/') + 1;
27
28 // Extract the ID from the shortened URL
29 string uniqueId = shortUrl.substr(idStartPos, shortUrl.size() - idStartPos);
30
31 // Look up and return the original URL using the ID
32 return idToUrlMap[uniqueId];
33 }
34
35private:
36 // Hash map to store the mapping from unique ID to original URL
37 unordered_map<string, string> idToUrlMap;
38
39 // Counter to generate unique IDs for each URL
40 int idCounter = 0;
41
42 // Base domain prefix for shortened URLs
43 const string DOMAIN_PREFIX = "https://tinyurl.com/";
44};
45
46// Your Solution object will be instantiated and called as such:
47// Solution solution;
48// solution.decode(solution.encode(url));
49
1// Base domain prefix for shortened URLs
2const DOMAIN_PREFIX = "https://tinyurl.com/";
3
4// Hash map to store the mapping from unique ID to original URL
5const idToUrlMap: Map<string, string> = new Map();
6
7// Counter to generate unique IDs for each URL
8let idCounter = 0;
9
10/**
11 * Encodes a long URL to a shortened URL.
12 * @param longUrl - The original URL to be shortened
13 * @returns The shortened URL with format "https://tinyurl.com/{id}"
14 */
15function encode(longUrl: string): string {
16 // Generate a unique ID by incrementing the counter
17 idCounter++;
18 const uniqueId = idCounter.toString();
19
20 // Store the mapping from ID to original URL
21 idToUrlMap.set(uniqueId, longUrl);
22
23 // Return the shortened URL by concatenating domain with ID
24 return DOMAIN_PREFIX + uniqueId;
25}
26
27/**
28 * Decodes a shortened URL to its original URL.
29 * @param shortUrl - The shortened URL to be decoded
30 * @returns The original long URL
31 */
32function decode(shortUrl: string): string {
33 // Find the position after the last '/' to extract the ID
34 const idStartPos = shortUrl.lastIndexOf('/') + 1;
35
36 // Extract the ID from the shortened URL
37 const uniqueId = shortUrl.substring(idStartPos);
38
39 // Look up and return the original URL using the ID
40 return idToUrlMap.get(uniqueId) || "";
41}
42
Time and Space Complexity
Time Complexity:
encode()
:O(1)
- The method performs constant time operations including incrementing a counter, storing a key-value pair in a dictionary, and string formatting.decode()
:O(1)
- The method performs a string split operation which takesO(n)
where n is the length of the URL, but since the URL length is bounded and relatively small, we can consider itO(1)
. Dictionary lookup is alsoO(1)
on average.
Space Complexity:
- Overall:
O(n)
wheren
is the number of unique URLs encoded. - The dictionary
self.m
stores a mapping for each unique long URL that has been encoded, growing linearly with the number of URLs. - Each
encode()
call adds one entry to the dictionary, consumingO(1)
additional space per call. - The
decode()
method doesn't use any additional space beyond temporary variables, so it'sO(1)
for the operation itself.
Learn more about how to find time and space complexity quickly.
Common Pitfalls
1. URL Collision in High-Traffic Scenarios
The counter-based approach is not thread-safe. In a multi-threaded environment or distributed system, multiple requests could read the same counter value before incrementing, causing collisions where different long URLs get mapped to the same short URL.
Solution:
import threading
class Codec:
def __init__(self):
self.url_map = defaultdict(str)
self.counter = 0
self.base_domain = 'https://tinyurl.com/'
# Add a lock for thread safety
self.lock = threading.Lock()
def encode(self, longUrl: str) -> str:
with self.lock:
self.counter += 1
current_id = self.counter
self.url_map[str(current_id)] = longUrl
return f'{self.base_domain}{current_id}'
2. Predictable URL Pattern Security Risk
Sequential IDs (1, 2, 3...) make it trivial for malicious users to enumerate all shortened URLs by simply incrementing the ID, potentially exposing private or sensitive URLs.
Solution:
import random
import string
class Codec:
def __init__(self):
self.url_to_code = {}
self.code_to_url = {}
self.base_domain = 'https://tinyurl.com/'
def encode(self, longUrl: str) -> str:
# Check if URL was already encoded
if longUrl in self.url_to_code:
return f'{self.base_domain}{self.url_to_code[longUrl]}'
# Generate random 6-character code
while True:
code = ''.join(random.choices(string.ascii_letters + string.digits, k=6))
if code not in self.code_to_url:
break
self.url_to_code[longUrl] = code
self.code_to_url[code] = longUrl
return f'{self.base_domain}{code}'
3. Missing Duplicate URL Handling
The current implementation creates a new short URL every time the same long URL is encoded, wasting storage and creating multiple short URLs for the same resource.
Solution:
class Codec:
def __init__(self):
self.url_map = defaultdict(str)
self.reverse_map = {} # Map from long URL to short URL ID
self.counter = 0
self.base_domain = 'https://tinyurl.com/'
def encode(self, longUrl: str) -> str:
# Check if this URL was already encoded
if longUrl in self.reverse_map:
return f'{self.base_domain}{self.reverse_map[longUrl]}'
self.counter += 1
self.url_map[str(self.counter)] = longUrl
self.reverse_map[longUrl] = str(self.counter)
return f'{self.base_domain}{self.counter}'
4. No Input Validation
The code doesn't validate if the input URLs are properly formatted or if the short URL in decode() actually exists in the system.
Solution:
class Codec:
def decode(self, shortUrl: str) -> str:
# Validate short URL format
if not shortUrl.startswith(self.base_domain):
raise ValueError(f"Invalid short URL format: {shortUrl}")
url_id = shortUrl.split('/')[-1]
# Check if ID exists in our system
if url_id not in self.url_map:
raise KeyError(f"Short URL not found: {shortUrl}")
return self.url_map[url_id]
5. Memory Limitations for Large Scale
Using a simple in-memory dictionary means all mappings are lost if the service restarts, and memory usage grows unbounded with the number of URLs.
Solution: Consider using a persistent database (Redis, PostgreSQL) or implementing an LRU cache with database backing:
from functools import lru_cache
class Codec:
def __init__(self, max_cache_size=10000):
self.cache = {} # In production, use Redis or a database
self.counter = self._load_counter() # Load from persistent storage
@lru_cache(maxsize=1000)
def decode(self, shortUrl: str) -> str:
url_id = shortUrl.split('/')[-1]
# In production, fetch from database if not in cache
return self._fetch_from_storage(url_id)
What are the most two important steps in writing a depth first search function? (Select 2)
Recommended Readings
Coding Interview Patterns Your Personal Dijkstra's Algorithm to Landing Your Dream Job The goal of AlgoMonster is to help you get a job in the shortest amount of time possible in a data driven way We compiled datasets of tech interview problems and broke them down by patterns This way
Recursion Recursion is one of the most important concepts in computer science Simply speaking recursion is the process of a function calling itself Using a real life analogy imagine a scenario where you invite your friends to lunch https assets algo monster recursion jpg You first call Ben and ask
Runtime Overview When learning about algorithms and data structures you'll frequently encounter the term time complexity This concept is fundamental in computer science and offers insights into how long an algorithm takes to complete given a certain input size What is Time Complexity Time complexity represents the amount of time
Want a Structured Path to Master System Design Too? Don’t Miss This!