474. Ones and Zeroes
Problem Description
In this problem from LeetCode, we are tasked with finding the largest subset of binary strings from a given array strs
. A binary string consists only of '0's and '1's. We also have two integer constraints, m
and n
, which are the maximum number of '0's and '1's allowed in our subset, respectively.
The key part to understand here is:
- A subset means every element in our selected group of strings must also exist in the original group (
strs
), but not necessarily all elements instrs
have to be in our subset. - When creating the largest subset, the total count of '0's in all the strings of our subset cannot exceed
m
, and the total count of '1's cannot exceedn
.
Our goal is to identify the size of this largest subset. Size means the number of strings within it, not the length of each string.
Intuition
To solve this problem, we use a dynamic programming approach.
-
We start by understanding that the choice of including each string can affect our ability to include other strings. If we include a string with many '0's and '1's, it may prevent us from adding additional strings later. Thus, we have to make careful choices.
-
Generally, in dynamic programming, we try to solve smaller subproblems and use their results to construct solutions for larger subproblems. This concept is called optimal substructure.
-
Here, we define a two-dimensional array
f
wheref[i][j]
represents the size of the largest subset we can form with at mosti
'0's andj
'1's. The values ofi
range from 0 tom
, and the values ofj
range from 0 ton
, representing all possible constraints we might encounter. -
We initialize our array with zeros, as the largest subset with zero '0's and '1's is an empty subset, hence zero size.
-
We iterate through each string in
strs
and count its '0's and '1's. Then, for each string, we update our arrayf
in a decreasing manner, starting fromm
to the count of '0's in the current string (a
) and fromn
to the count of '1's (b
). -
While updating, we consider two cases for each cell
f[i][j]
:- Not including the current string, which means the
f[i][j]
would remain unchanged. - Including the current string, where we have to look at the value in the cell that represents the leftover capacity (
f[i - a][j - b]
) after including this string and add 1 to that value to represent the current string being counted.
- Not including the current string, which means the
-
We choose the maximum of these two choices at every step, which ensures that we always have the largest possible subset for a given
i
andj
. -
After iterating through all strings and updating the array, the value of
f[m][n]
will give us the size of the largest subset conforming to our constraints.
This dynamic programming solution is efficient as it avoids recalculating the largest subset sizes for every combination of '0's and '1's by building upon previously computed values.
Learn more about Dynamic Programming patterns.
Solution Approach
The implementation of the solution follows the dynamic programming approach to methodically work towards the final answer. Here's an in-depth walk-through of the pattern and the algorithm used:
-
Data Structure: A two-dimensional list
f
is created with dimensions(m + 1) x (n + 1)
. In Python, this is realized as a list of lists. Each cellf[i][j]
in this array represents the size of the largest subset withi
'0's andj
'1's. The+ 1
in both dimensions is used to include the case where0
'0's or `1's are used. -
Initialization: The two-dimensional list
f
is initialized with zeroes, since the largest subset without considering any strings (and thus having0
'0's and1's) has a size of
0`. -
Counting '0's and '1's: For each string
s
instrs
,s.count("0")
ands.count("1")
are called to count the number of '0's(a)
and '1's(b)
respectively. -
Updating the DP Table: We iterate over the list in reverse for
i
fromm
toa - 1
andj
fromn
tob - 1
. We do this because we want to make sure that when we account for a new string, we are not overwriting cells that could affect the calculation of cells later in the iteration. This is a common technique in dynamic programming known as avoiding "state contamination." -
Choice: At each cell
f[i][j]
, we attempt to include the current string. To do this, we compare the existing valuef[i][j]
(not including the current string) withf[i - a][j - b] + 1
(including the string).f[i - a][j - b]
represents the largest subset possible with the remaining capacity after including the current string. We add1
because we are including the current string in our subset. -
Taking the Maximum: We use Python's
max
function to always store the maximum of the two values. Thus,f[i][j]
will always hold the size of the largest subset for the specific capacity represented byi
andj
. -
Result: After completely filling the two-dimensional list,
f[m][n]
will give us the maximum size of our desired subset since it represents the size of the largest subset under the full capacity ofm
'0's andn
'1's.
This implementation successfully leverages the central ideas of dynamic programming, namely optimal substructure (solving bigger problems by relying on the solutions to smaller problems) and overlapping subproblems (saving computation by storing intermediate results). The use of a two-dimensional DP table is crucial, as it allows tracking the state of the problem (how many '0's and '1's can still be included) at each step.
Ready to land your dream job?
Unlock your dream job with a 2-minute evaluator for a personalized learning plan!
Start EvaluatorExample Walkthrough
Let's take a small example to illustrate the solution approach.
Assume we have the following array of binary strings strs
: ["10", "0001", "111001", "1", "0"]
, and our integer constraints are m = 5
and n = 3
. This means we cannot have more than 5 '0's and 3 '1's in our subset.
- Initialization: We create a two-dimensional list
f
with dimensions(m + 1) x (n + 1)
, sof
would be a 6x4 matrix, as we include zero counts. It's filled with zeros, like so:
f = [ [0, 0, 0, 0], [0, 0, 0, 0], [0, 0, 0, 0], [0, 0, 0, 0], [0, 0, 0, 0], [0, 0, 0, 0] ]
-
Counting and updating: We go through each string in
strs
and update our DP table.- For the string
"10"
, we havea=1
(the number of '0's) andb=1
(the number of '1's). We update cells in the range ofi = 5 to 1
andj = 3 to 1
, comparing the existing valuef[i][j]
withf[i - 1][j - 1] + 1
(sincea=1
andb=1
).
After this string, our DP table update looks like this:
- For the string
f = [ [0, 0, 0, 0], // No strings considered yet [0, 1, 1, 1], // '10' considered for i=1 (up to 1 '0's used) [0, 1, 1, 1], [0, 1, 1, 1], [0, 1, 1, 1], [0, 1, 1, 1] ]
- Applying a similar process for other strings:
"0001"
,"111001"
,"1"
, and"0"
, updatingf
for the '0's and '1's in each and choosing the max value for each cell.
- Final DP Table: Once we process all strings, our DP table will display the maximum number of strings that can be included for a given
i
number of '0's andj
number of '1's. For our case, we getf[m][n]
as the result. Let's assume our final table after all updates looks like this:
f = [ [0, 0, 0, 0], [0, 1, 1, 1], [1, 1, 2, 2], [1, 2, 2, 2], [1, 2, 3, 3], [1, 2, 3, 4] ]
- Result: From the last entry
f[5][3]
, we see that the maximum size of the subset we can get under the given constraints is 4.
Thus, with m = 5
and n = 3
, we are able to include four strings from the array strs
in our subset without exceeding the number of '0's and '1's allowed. The subset, in this case, could be ["10", "0001", "1", "0"]
, which includes 4 strings, adheres to the constraints (`5 '0's and 3 '1's), and is the largest possible subset for these constraints.
Solution Implementation
1from typing import List
2
3class Solution:
4 def findMaxForm(self, strings: List[str], max_zeros: int, max_ones: int) -> int:
5 # Initialize the DP table with dimensions (max_zeros + 1) by (max_ones + 1)
6 dp = [[0] * (max_ones + 1) for _ in range(max_zeros + 1)]
7
8 # Iterate through each string in the input list
9 for s in strings:
10 # Count the number of zeros and ones in the current string
11 zero_count, one_count = s.count("0"), s.count("1")
12
13 # Iterate over the DP table in reverse to avoid using a result before it's updated
14 for zeros in range(max_zeros, zero_count - 1, -1):
15 for ones in range(max_ones, one_count - 1, -1):
16 # Update the DP table value for the current subproblem
17 dp[zeros][ones] = max(dp[zeros][ones], dp[zeros - zero_count][ones - one_count] + 1)
18
19 # The answer is the value corresponding to using maximum zeros and ones
20 return dp[max_zeros][max_ones]
21
1public class Solution {
2
3 // The main function to find maximum number of strings that can be formed with m zeros and n ones
4 public int findMaxForm(String[] strs, int m, int n) {
5 // Initialize a DP table where f[i][j] will represent the max number of strings that can be formed with i zeros and j ones
6 int[][] dp = new int[m + 1][n + 1];
7
8 // Iterate through each string in the input list
9 for (String s : strs) {
10 // Count the number of zeros and ones in the current string
11 int[] count = countZerosAndOnes(s);
12
13 // Loop over the dp array from bottom up considering the current string's zeros and ones
14 for (int i = m; i >= count[0]; --i) {
15 for (int j = n; j >= count[1]; --j) {
16 // Update the dp value with the higher value between the current and the new computed one
17 dp[i][j] = Math.max(dp[i][j], dp[i - count[0]][j - count[1]] + 1);
18 }
19 }
20 }
21
22 // Return the result from the DP table which is the maximum number of strings that can be formed
23 return dp[m][n];
24 }
25
26 // Helper function to count the number of zeros and ones in a string
27 private int[] countZerosAndOnes(String s) {
28 // Initialize a count array where the first element is the number of zeros and the second is the number of ones
29 int[] count = new int[2];
30
31 // Iterate through the characters of the string
32 for (int i = 0; i < s.length(); ++i) {
33 // Increment the respective count (0 or 1) based on the current character
34 ++count[s.charAt(i) - '0'];
35 }
36
37 // Return the count array
38 return count;
39 }
40}
41
1#include <vector>
2#include <string>
3#include <algorithm>
4#include <cstring> // For memset
5
6using namespace std;
7
8class Solution {
9public:
10 int findMaxForm(vector<string>& strs, int m, int n) {
11 // Create a 2D array (dp) with dimensions m+1 and n+1
12 // Initialize all elements to zero
13 int dp[m + 1][n + 1];
14 memset(dp, 0, sizeof(dp));
15
16 // Iterate over each string in the given vector 'strs'
17 for (auto& str : strs) {
18 // Count the number of zeroes and ones in the current string
19 pair<int, int> zeroOneCount = countZeroesAndOnes(str);
20 int zeroes = zeroOneCount.first;
21 int ones = zeroOneCount.second;
22
23 // Iterate over the matrix in reverse, to avoid over-counting
24 // when using previously computed sub-solutions
25 for (int i = m; i >= zeroes; --i) {
26 for (int j = n; j >= ones; --j) {
27 // Update the dp matrix by taking the maximum between:
28 // 1. Current cell value (previous computed max)
29 // 2. Value computed by including the current string
30 // Add 1 to the subproblem solution because
31 // we are including one more string
32 dp[i][j] = max(dp[i][j], dp[i - zeroes][j - ones] + 1);
33 }
34 }
35 }
36 // Return the maximum number of strings that can be formed
37 // with given 'm' zeroes and 'n' ones
38 return dp[m][n];
39 }
40
41private:
42 // Helper function to count the number of zeroes and ones in a string
43 pair<int, int> countZeroesAndOnes(string& str) {
44 int countZeroes = count_if(str.begin(), str.end(), [](char c) { return c == '0'; });
45 // First of the pair is number of zeroes, second is the number of ones
46 // Since the total length minus zeroes gives the number of ones
47 return {countZeroes, static_cast<int>(str.size()) - countZeroes};
48 }
49};
50
1function findMaxForm(strings: string[], zeroLimit: number, oneLimit: number): number {
2 // Initialize a memoization table with dimensions (zeroLimit + 1) x (oneLimit + 1).
3 // This table will help us keep track of the maximum number of strings we can include
4 // given a specific limit of zeroes and ones.
5 const dpTable = Array.from({ length: zeroLimit + 1 }, () =>
6 Array.from({ length: oneLimit + 1 }, () => 0)
7 );
8
9 // A helper function to count the number of zeroes and ones in a string.
10 // It returns a tuple [zeroCount, oneCount].
11 const countZeroesAndOnes = (str: string): [number, number] => {
12 let zeroCount = 0;
13 for (const char of str) {
14 if (char === '0') {
15 zeroCount++;
16 }
17 }
18 return [zeroCount, str.length - zeroCount];
19 };
20
21 // Iterate through each string in the input array.
22 for (const str of strings) {
23 // Count the number of zeroes and ones in the current string.
24 const [zeroes, ones] = countZeroesAndOnes(str);
25
26 // Update the dpTable in reverse to avoid overwriting data we still need to use.
27 for (let i = zeroLimit; i >= zeroes; --i) {
28 for (let j = oneLimit; j >= ones; --j) {
29 // The maximum number of strings that can be included is either the current count
30 // or the count obtained by including the current string plus the count of strings
31 // that can be included with the remaining zeroes and ones.
32 dpTable[i][j] = Math.max(dpTable[i][j], dpTable[i - zeroes][j - ones] + 1);
33 }
34 }
35 }
36
37 // The final result is stored in dpTable[zeroLimit][oneLimit], reflecting the maximum number
38 // of strings we can include given the original zeroLimit and oneLimit.
39 return dpTable[zeroLimit][oneLimit];
40}
41
Time and Space Complexity
Time Complexity:
The time complexity of the given solution is O(k * m * n)
, where k
is the length of the input list strs
, m
is the maximum number of zeroes, and n
is the maximum number of ones that our subsets from strs
can contain. This complexity arises because we iterate over all strings in strs
, and for each string, we iterate through a 2D array of size m * n
in a nested loop fashion.
Space Complexity:
The space complexity of the solution is O(m * n)
, as we are constructing a 2D array f
with m + 1
rows and n + 1
columns to store the intermediate results for dynamic programming. No other data structures are used that grow with the size of the input, so the space complexity is directly proportional to the size of the 2D array.
Learn more about how to find time and space complexity quickly using problem constraints.
Which data structure is used in a depth first search?
Recommended Readings
What is Dynamic Programming Prerequisite DFS problems dfs_intro Backtracking problems backtracking Memoization problems memoization_intro Pruning problems backtracking_pruning Dynamic programming is an algorithmic optimization technique that breaks down a complicated problem into smaller overlapping sub problems in a recursive manner and uses solutions to the sub problems to construct a solution
LeetCode Patterns Your Personal Dijkstra's Algorithm to Landing Your Dream Job The goal of AlgoMonster is to help you get a job in the shortest amount of time possible in a data driven way We compiled datasets of tech interview problems and broke them down by patterns This way we
Recursion Recursion is one of the most important concepts in computer science Simply speaking recursion is the process of a function calling itself Using a real life analogy imagine a scenario where you invite your friends to lunch https algomonster s3 us east 2 amazonaws com recursion jpg You first
Want a Structured Path to Master System Design Too? Don’t Miss This!