Facebook Pixel

2408. Design SQL 🔒

MediumDesignArrayHash TableString
Leetcode Link

Problem Description

You need to implement a simplified SQL database system that manages multiple tables. Each table has a fixed number of columns and supports basic operations like inserting rows, deleting rows, selecting cells, and exporting data.

The system should work as follows:

Initial Setup:

  • You're given two arrays: names (table names) and columns (number of columns for each table)
  • The i-th table has name names[i] and contains columns[i] columns

Operations to Implement:

  1. Insert (ins): Add a new row to a specified table

    • Each row gets an auto-incremented ID (starting from 1)
    • The ID continues incrementing even if previous rows were deleted
    • Returns true if successful, false if the row length doesn't match the table's column count or if the table doesn't exist
  2. Remove (rmv): Delete a row from a table by its ID

    • Does nothing if the table doesn't exist or the row ID doesn't exist
    • Deleting a row doesn't affect the auto-increment counter
  3. Select (sel): Retrieve a specific cell value

    • Takes table name, row ID, and column ID (1-indexed)
    • Returns the cell value as a string
    • Returns "<null>" if the table, row, or column doesn't exist
  4. Export (exp): Export all rows from a table in CSV format

    • Returns an array of strings, each representing a row
    • Each row string contains the row ID followed by all column values, separated by commas
    • Returns an empty array if the table doesn't exist

Example: If you have a table "users" with 3 columns:

  • Insert ["Alice", "25", "NYC"] → gets ID 1
  • Insert ["Bob", "30", "LA"] → gets ID 2
  • Delete row 1
  • Insert ["Charlie", "35", "SF"] → gets ID 3 (not 1, because auto-increment continues)
  • Export returns: ["2,Bob,30,LA", "3,Charlie,35,SF"]

The implementation uses a hash table where keys are table names and values are lists of rows. The auto-increment logic needs to track the next available ID for each table separately.

Quick Interview Experience
Help others by sharing your interview experience
Have you seen this problem before?

Intuition

When we look at this problem, we need to simulate a basic database system with multiple tables. The key insight is that we need to track two main things for each table: the actual data rows and the auto-increment counter for IDs.

The most straightforward approach is to use a hash table (dictionary) where each table name maps to its data structure. But what should this data structure be?

Initially, we might think of using a simple list to store rows. However, this creates a problem with the delete operation. If we use the list index as the row ID, deleting a row would shift all subsequent rows, changing their IDs. This violates the requirement that row IDs should remain stable.

Instead, we need to separate the concept of row ID from the list index. We can store each row along with its ID. When we delete a row, we can either:

  1. Mark it as deleted without removing it from the list
  2. Use a dictionary mapping row IDs to row data

The second approach using a nested dictionary structure (tables[name][rowId] = row) is cleaner because:

  • Insertion is O(1) - just add with the next ID
  • Deletion is O(1) - just delete the key
  • Selection is O(1) - direct access by ID
  • Export requires iterating through all existing rows

For tracking the auto-increment counter, we need a separate counter for each table that keeps incrementing regardless of deletions. This ensures that even if we delete row 5, the next inserted row gets ID 6, not 5.

The column count validation during insertion is straightforward - we just need to store the expected column count for each table and compare it with len(row).

This leads us to maintain:

  • A dictionary of dictionaries for the actual data: tables[table_name][row_id] = row_data
  • A dictionary for auto-increment counters: next_id[table_name] = next_available_id
  • A dictionary for column counts: column_count[table_name] = expected_columns

Solution Approach

Based on the reference solution approach, we use a hash table to store the mapping between table names and their data rows. Let's walk through the implementation details:

Data Structure:

  • Use a defaultdict(list) called tables where each key is a table name and the value is a list storing all rows for that table
  • Additionally, we need to track:
    • The expected column count for each table
    • The next auto-increment ID for each table

Implementation of Each Operation:

  1. Constructor (__init__):

    def __init__(self, names, columns):
        self.tables = {}  # table_name -> {row_id -> row_data}
        self.next_id = {}  # table_name -> next_available_id
        self.column_count = {}  # table_name -> expected_columns
      
        for i in range(len(names)):
            self.tables[names[i]] = {}
            self.next_id[names[i]] = 1
            self.column_count[names[i]] = columns[i]
  2. Insert Operation (insertRow):

    • Check if the table exists and if the row length matches the expected column count
    • If valid, assign the current auto-increment ID to the row
    • Store the row with its ID as the key
    • Increment the next available ID for that table
    def insertRow(self, name, row):
        if name not in self.tables or len(row) != self.column_count[name]:
            return False
      
        row_id = self.next_id[name]
        self.tables[name][row_id] = row
        self.next_id[name] += 1
        return True
  3. Delete Operation (deleteRow):

    • Simply remove the row from the dictionary if it exists
    • No need to adjust other row IDs or the auto-increment counter
    def deleteRow(self, name, rowId):
        if name in self.tables and rowId in self.tables[name]:
            del self.tables[name][rowId]
  4. Select Operation (selectCell):

    • Check if the table exists, row exists, and column index is valid
    • Return the cell value (using 1-indexed column ID)
    • Return "<null>" for any invalid access
    def selectCell(self, name, rowId, columnId):
        if (name not in self.tables or 
            rowId not in self.tables[name] or 
            columnId < 1 or 
            columnId > len(self.tables[name][rowId])):
            return "<null>"
      
        return self.tables[name][rowId][columnId - 1]
  5. Export Operation (exp):

    • Iterate through all existing rows in the table
    • Format each row as CSV: "id,value1,value2,..."
    • Sort by row ID to maintain order
    def exp(self, name):
        if name not in self.tables:
            return []
      
        result = []
        for row_id in sorted(self.tables[name].keys()):
            row_str = str(row_id) + "," + ",".join(self.tables[name][row_id])
            result.append(row_str)
      
        return result

Time Complexity:

  • Insert: O(1)
  • Delete: O(1)
  • Select: O(1)
  • Export: O(n log n) where n is the number of rows (due to sorting)

Space Complexity:

  • O(m × n × c) where m is the number of tables, n is the average number of rows per table, and c is the average number of columns per row

Ready to land your dream job?

Unlock your dream job with a 5-minute evaluator for a personalized learning plan!

Start Evaluator

Example Walkthrough

Let's walk through a small example to illustrate how the solution works:

Initial Setup:

names = ["users", "products"]
columns = [3, 2]

This creates two tables:

  • "users" with 3 columns
  • "products" with 2 columns

Our data structures after initialization:

tables = {"users": {}, "products": {}}
next_id = {"users": 1, "products": 1}
column_count = {"users": 3, "products": 2}

Step 1: Insert into "users" table

insertRow("users", ["Alice", "25", "NYC"])
  • Check: "users" exists ✓, len(["Alice", "25", "NYC"]) = 3 = column_count["users"] ✓
  • Assign row_id = next_id["users"] = 1
  • Store: tables["users"][1] = ["Alice", "25", "NYC"]
  • Increment: next_id["users"] = 2
  • Return: true

State after insertion:

tables["users"] = {1: ["Alice", "25", "NYC"]}
next_id["users"] = 2

Step 2: Insert another row

insertRow("users", ["Bob", "30", "LA"])
  • Row gets ID 2
  • tables["users"][2] = ["Bob", "30", "LA"]
  • next_id["users"] = 3

Step 3: Delete row 1

deleteRow("users", 1)
  • Remove key 1 from tables["users"]
  • Note: next_id["users"] remains 3 (unchanged)

State after deletion:

tables["users"] = {2: ["Bob", "30", "LA"]}
next_id["users"] = 3  # Still 3!

Step 4: Insert after deletion

insertRow("users", ["Charlie", "35", "SF"])
  • Gets row_id = 3 (not 1, because auto-increment continues)
  • tables["users"][3] = ["Charlie", "35", "SF"]
  • next_id["users"] = 4

Step 5: Select a cell

selectCell("users", 2, 2)
  • Check: "users" exists ✓, row 2 exists ✓, column 2 ≤ 3 ✓
  • Return tables["users"][2][2-1] = tables["users"][2][1] = "30"

Step 6: Export the table

exp("users")
  • Iterate through sorted keys: [2, 3]
  • For row_id=2: "2,Bob,30,LA"
  • For row_id=3: "3,Charlie,35,SF"
  • Return: ["2,Bob,30,LA", "3,Charlie,35,SF"]

Step 7: Invalid operations

insertRow("users", ["Dan", "40"])  # Returns false (2 values, needs 3)
selectCell("users", 1, 1)          # Returns "<null>" (row 1 was deleted)
selectCell("inventory", 1, 1)      # Returns "<null>" (table doesn't exist)

This example demonstrates:

  1. Auto-increment IDs persist through deletions (row 3 comes after row 2, even though row 1 was deleted)
  2. Direct access to rows by ID using dictionary structure
  3. Validation of column counts during insertion
  4. Handling of invalid access attempts

Solution Implementation

1from collections import defaultdict
2from typing import List
3
4class SQL:
5    """
6    A simple SQL-like database implementation that supports basic table operations.
7    """
8  
9    def __init__(self, names: List[str], columns: List[int]):
10        """
11        Initialize the SQL database with table names and their column counts.
12      
13        Args:
14            names: List of table names to create
15            columns: List of column counts for each corresponding table
16        """
17        # Dictionary to store tables, where key is table name and value is list of rows
18        self.tables = defaultdict(list)
19      
20        # Store table metadata for potential future use
21        self.table_names = names
22        self.table_columns = columns
23  
24    def insertRow(self, name: str, row: List[str]) -> None:
25        """
26        Insert a new row into the specified table.
27      
28        Args:
29            name: The name of the table to insert into
30            row: List of values representing the row data
31        """
32        # Append the row to the table's list of rows
33        self.tables[name].append(row)
34  
35    def deleteRow(self, name: str, rowId: int) -> None:
36        """
37        Delete a row from the specified table by row ID.
38      
39        Args:
40            name: The name of the table to delete from
41            rowId: The 1-based index of the row to delete
42        """
43        # Check if table exists and row_id is valid
44        if name in self.tables and 0 < rowId <= len(self.tables[name]):
45            # Remove the row at the specified index (convert to 0-based index)
46            del self.tables[name][rowId - 1]
47  
48    def selectCell(self, name: str, rowId: int, columnId: int) -> str:
49        """
50        Select and return a specific cell value from a table.
51      
52        Args:
53            name: The name of the table to select from
54            rowId: The 1-based index of the row
55            columnId: The 1-based index of the column
56          
57        Returns:
58            The value at the specified cell as a string
59        """
60        # Convert 1-based indices to 0-based and return the cell value
61        return self.tables[name][rowId - 1][columnId - 1]
62
63
64# Your SQL object will be instantiated and called as such:
65# obj = SQL(names, columns)
66# obj.insertRow(name, row)
67# obj.deleteRow(name, rowId)
68# param_3 = obj.selectCell(name, rowId, columnId)
69
1class SQL {
2    // Map to store table name as key and list of rows as value
3    // Each row is represented as a List<String>
4    private Map<String, List<List<String>>> tables;
5
6    /**
7     * Constructor to initialize the SQL database
8     * @param names List of table names
9     * @param columns List of column counts for each table (currently unused)
10     */
11    public SQL(List<String> names, List<Integer> columns) {
12        // Initialize the HashMap with initial capacity based on number of tables
13        tables = new HashMap<>(names.size());
14    }
15
16    /**
17     * Insert a new row into the specified table
18     * @param name Name of the table
19     * @param row List of values representing the row to be inserted
20     */
21    public void insertRow(String name, List<String> row) {
22        // If table doesn't exist, create a new ArrayList for it
23        // Then add the row to the table
24        tables.computeIfAbsent(name, k -> new ArrayList<>()).add(row);
25    }
26
27    /**
28     * Delete a row from the specified table
29     * @param name Name of the table
30     * @param rowId ID of the row to delete (1-indexed)
31     */
32    public void deleteRow(String name, int rowId) {
33        // Implementation needed: Remove the row at index (rowId - 1)
34        // Since rowId is 1-indexed, we need to subtract 1 for 0-indexed list
35        if (tables.containsKey(name)) {
36            List<List<String>> tableRows = tables.get(name);
37            if (rowId > 0 && rowId <= tableRows.size()) {
38                tableRows.remove(rowId - 1);
39            }
40        }
41    }
42
43    /**
44     * Select and return a specific cell value from the table
45     * @param name Name of the table
46     * @param rowId ID of the row (1-indexed)
47     * @param columnId ID of the column (1-indexed)
48     * @return The value at the specified cell
49     */
50    public String selectCell(String name, int rowId, int columnId) {
51        // Convert 1-indexed rowId and columnId to 0-indexed for list access
52        return tables.get(name).get(rowId - 1).get(columnId - 1);
53    }
54}
55
56/**
57 * Your SQL object will be instantiated and called as such:
58 * SQL obj = new SQL(names, columns);
59 * obj.insertRow(name, row);
60 * obj.deleteRow(name, rowId);
61 * String param3 = obj.selectCell(name, rowId, columnId);
62 */
63
1class SQL {
2private:
3    // Store tables as a map: table name -> list of rows (each row is a vector of strings)
4    unordered_map<string, vector<vector<string>>> tables;
5  
6public:
7    /**
8     * Constructor to initialize the SQL database with table names and their column counts
9     * @param names - List of table names to create
10     * @param columns - List of column counts for each corresponding table
11     */
12    SQL(vector<string>& names, vector<int>& columns) {
13        // Initialize empty tables for each table name
14        for (int i = 0; i < names.size(); i++) {
15            tables[names[i]] = vector<vector<string>>();
16        }
17    }
18  
19    /**
20     * Insert a new row into the specified table
21     * @param name - Name of the table to insert into
22     * @param row - Vector of string values representing the row data
23     */
24    void insertRow(string name, vector<string> row) {
25        tables[name].push_back(row);
26    }
27  
28    /**
29     * Delete a row from the specified table by row ID
30     * @param name - Name of the table to delete from
31     * @param rowId - 1-indexed row ID to delete
32     */
33    void deleteRow(string name, int rowId) {
34        // Remove the row at index (rowId - 1) since rowId is 1-indexed
35        tables[name].erase(tables[name].begin() + (rowId - 1));
36    }
37  
38    /**
39     * Select and return a specific cell value from a table
40     * @param name - Name of the table to select from
41     * @param rowId - 1-indexed row ID
42     * @param columnId - 1-indexed column ID
43     * @return The string value at the specified cell
44     */
45    string selectCell(string name, int rowId, int columnId) {
46        // Convert 1-indexed IDs to 0-indexed for vector access
47        return tables[name][rowId - 1][columnId - 1];
48    }
49};
50
51/**
52 * Your SQL object will be instantiated and called as such:
53 * SQL* obj = new SQL(names, columns);
54 * obj->insertRow(name, row);
55 * obj->deleteRow(name, rowId);
56 * string param_3 = obj->selectCell(name, rowId, columnId);
57 */
58
1// Store tables as a map: table name -> list of rows (each row is an array of strings)
2let tables: Map<string, string[][]> = new Map();
3
4/**
5 * Initialize the SQL database with table names and their column counts
6 * @param names - List of table names to create
7 * @param columns - List of column counts for each corresponding table
8 */
9function SQL(names: string[], columns: number[]): void {
10    // Clear any existing tables
11    tables = new Map();
12  
13    // Initialize empty tables for each table name
14    for (let i = 0; i < names.length; i++) {
15        tables.set(names[i], []);
16    }
17}
18
19/**
20 * Insert a new row into the specified table
21 * @param name - Name of the table to insert into
22 * @param row - Array of string values representing the row data
23 */
24function insertRow(name: string, row: string[]): void {
25    // Get the table and add the new row
26    const table = tables.get(name);
27    if (table !== undefined) {
28        table.push(row);
29    }
30}
31
32/**
33 * Delete a row from the specified table by row ID
34 * @param name - Name of the table to delete from
35 * @param rowId - 1-indexed row ID to delete
36 */
37function deleteRow(name: string, rowId: number): void {
38    // Get the table and remove the row at index (rowId - 1) since rowId is 1-indexed
39    const table = tables.get(name);
40    if (table !== undefined) {
41        table.splice(rowId - 1, 1);
42    }
43}
44
45/**
46 * Select and return a specific cell value from a table
47 * @param name - Name of the table to select from
48 * @param rowId - 1-indexed row ID
49 * @param columnId - 1-indexed column ID
50 * @return The string value at the specified cell
51 */
52function selectCell(name: string, rowId: number, columnId: number): string {
53    // Get the table and return the value at the specified cell
54    // Convert 1-indexed IDs to 0-indexed for array access
55    const table = tables.get(name);
56    if (table !== undefined) {
57        return table[rowId - 1][columnId - 1];
58    }
59    return "";
60}
61

Time and Space Complexity

Time Complexity:

  • __init__: O(1) - Simply initializes an empty defaultdict
  • insertRow: O(1) - Appends a row to the list, which is a constant time operation
  • deleteRow: O(1) - Currently empty implementation, but would be O(1) if implemented as marking deletion without actual removal
  • selectCell: O(1) - Direct access to a list element by index

Overall, each operation has O(1) time complexity.

Space Complexity:

  • The space complexity is O(n) where n is the total number of elements stored across all tables
  • Each row inserted adds to the space usage proportionally
  • The defaultdict stores lists of rows, with total space proportional to the number of rows multiplied by the number of columns per row

Note: The deleteRow method is not implemented in the given code. If it were to be implemented using actual deletion from the list, it would have O(m) time complexity where m is the number of rows in the table due to list shifting. However, the reference answer suggests O(1), which implies a logical deletion approach (like marking as deleted) rather than physical removal.

Learn more about how to find time and space complexity quickly.

Common Pitfalls

1. Incorrect Auto-Increment Implementation

The provided code doesn't properly implement the auto-increment ID feature. It simply appends rows to a list, making row IDs position-based rather than persistent. When a row is deleted, all subsequent rows shift position, changing their IDs.

Problem Example:

sql = SQL(["users"], [3])
sql.insertRow("users", ["Alice", "25", "NYC"])  # Should be ID 1
sql.insertRow("users", ["Bob", "30", "LA"])     # Should be ID 2
sql.deleteRow("users", 1)                       # Delete Alice
sql.insertRow("users", ["Charlie", "35", "SF"]) # Should be ID 3, not 2
# Current code would make Charlie ID 2 (position-based)

2. Missing Validation in Insert Operation

The current insertRow method doesn't validate:

  • Whether the table exists
  • Whether the row has the correct number of columns
  • It should return True/False based on success

3. Missing Error Handling in Select Operation

The selectCell method doesn't handle invalid cases properly:

  • Non-existent table
  • Invalid row ID
  • Invalid column ID Should return "<null>" for any invalid access.

4. Missing Export Functionality

The code completely lacks the exp method required by the problem specification.

Corrected Solution

class SQL:
    def __init__(self, names: List[str], columns: List[int]):
        # Use dictionaries to store table data with persistent IDs
        self.tables = {}  # table_name -> {row_id -> row_data}
        self.next_id = {}  # table_name -> next_available_id
        self.column_count = {}  # table_name -> expected_columns
      
        for i in range(len(names)):
            self.tables[names[i]] = {}
            self.next_id[names[i]] = 1
            self.column_count[names[i]] = columns[i]
  
    def insertRow(self, name: str, row: List[str]) -> bool:
        # Validate table exists and column count matches
        if name not in self.tables or len(row) != self.column_count[name]:
            return False
      
        # Assign auto-incremented ID and store row
        row_id = self.next_id[name]
        self.tables[name][row_id] = row
        self.next_id[name] += 1
        return True
  
    def deleteRow(self, name: str, rowId: int) -> None:
        # Only delete if table and row exist
        if name in self.tables and rowId in self.tables[name]:
            del self.tables[name][rowId]
  
    def selectCell(self, name: str, rowId: int, columnId: int) -> str:
        # Comprehensive validation
        if (name not in self.tables or 
            rowId not in self.tables[name] or 
            columnId < 1 or 
            columnId > len(self.tables[name][rowId])):
            return "<null>"
      
        return self.tables[name][rowId][columnId - 1]
  
    def exp(self, name: str) -> List[str]:
        # Export table in CSV format
        if name not in self.tables:
            return []
      
        result = []
        # Sort by row ID to maintain order
        for row_id in sorted(self.tables[name].keys()):
            row_str = str(row_id) + "," + ",".join(self.tables[name][row_id])
            result.append(row_str)
      
        return result

Key Improvements:

  1. Uses dictionary to store rows with persistent IDs instead of position-based list
  2. Properly implements auto-increment that continues even after deletions
  3. Validates all inputs and returns appropriate values/errors
  4. Includes the missing exp method with proper CSV formatting
  5. Maintains row order by ID when exporting
Discover Your Strengths and Weaknesses: Take Our 5-Minute Quiz to Tailor Your Study Plan:

Which of the following is a good use case for backtracking?


Recommended Readings

Want a Structured Path to Master System Design Too? Don’t Miss This!

Load More