| 1 | +<!--?title Balanced bracket sequences --> |
| 2 | +# Balanced bracket sequences |
| 3 | + |
| 4 | +A **balanced bracket sequence** is a string consisting only of brackets such that, after inserting certain numbers and mathematical operations, it becomes a valid mathematical expression. |
| 5 | +Formally, balanced bracket sequences can be defined as follows: |
| 6 | + |
| 7 | +- $e$ (the empty string) is a balanced bracket sequence. |
| 8 | +- if $s$ is a balanced bracket sequence, then so is $(s)$. |
| 9 | +- if $s$ and $t$ are balanced bracket sequences, then so is $s t$. |
| 10 | + |
| 11 | +For instance $(())()$ is a balanced bracket sequence, but $())($ is not. |
| 12 | + |
| 13 | +Balanced bracket sequences with multiple bracket types can be defined in a similar fashion. |
| 14 | + |
| 15 | +In this article we discuss some classic problems involving balanced bracket sequences (for simplicity we will just call them sequences): validation, counting the sequences, finding the lexicographically next sequence, generating all sequences of a certain size, finding the index of a sequence, and generating the $k$-th sequence. |
| 16 | +We will also discuss two variations of these problems: the simpler version in which only one type of bracket is allowed, and the harder case with multiple types. |
| 17 | + |
| 18 | +## Balance validation |
| 19 | + |
| 20 | +We want to check whether a given string is balanced. |
| 21 | + |
| 22 | +First suppose there is only one type of bracket. |
| 23 | +For this case there is a very simple algorithm. |
| 24 | +Let $\text{depth}$ be the current number of open brackets. |
| 25 | +Initially $\text{depth} = 0$. |
| 26 | +We iterate over all characters of the string: if the current character is an opening bracket, we increment $\text{depth}$, otherwise we decrement it. |
| 27 | +If at any time $\text{depth}$ becomes negative, or at the end it is different from $0$, then the string is not a balanced sequence. |
| 28 | +Otherwise it is. |
| 29 | + |
| 30 | +If several bracket types are involved, the algorithm needs to be changed. |
| 31 | +Instead of the counter $\text{depth}$ we maintain a stack that stores all opening brackets we have met and not yet matched. |
| 32 | +If the current bracket character is an opening one, we put it onto the stack. |
| 33 | +If it is a closing one, then we check that the stack is non-empty and that the top element of the stack is of the same type as the current closing bracket. |
| 34 | +If both conditions are fulfilled, then we remove the opening bracket from the stack. |
| 35 | +If at any time one of the conditions is not fulfilled, or at the end the stack is not empty, then the string is not balanced. |
| 36 | +Otherwise it is. |
| 37 | + |
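The stack-based check described above can be sketched as follows. This is only an illustration: the function name `is_balanced` and the choice of the three bracket types `()`, `[]`, `{}` are assumptions, and any character that is not a bracket is treated as an error.

```cpp
#include <stack>
#include <string>

// Returns true if s consists only of the brackets ()[]{} and is balanced.
// Name and supported bracket set are illustrative assumptions.
bool is_balanced(const std::string& s) {
    std::stack<char> open;  // opening brackets that are not matched yet
    for (char c : s) {
        if (c == '(' || c == '[' || c == '{') {
            open.push(c);
            continue;
        }
        char expected;  // opening bracket that the closing bracket c needs
        if (c == ')') expected = '(';
        else if (c == ']') expected = '[';
        else if (c == '}') expected = '{';
        else return false;  // not a bracket at all
        if (open.empty() || open.top() != expected)
            return false;
        open.pop();
    }
    return open.empty();  // leftover opening brackets mean "not balanced"
}
```

With a single bracket type the stack only ever holds copies of the same character, so its size plays exactly the role of the counter $\text{depth}$.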
| 38 | +## Number of balanced sequences |
| 39 | + |
| 40 | +### Formula |
| 41 | + |
| 42 | +The number of balanced bracket sequences with only one bracket type can be calculated using the [Catalan numbers](./combinatorics/catalan-numbers.html). |
| 43 | +The number of balanced bracket sequences of length $2n$ ($n$ pairs of brackets) is: |
| 44 | +$$\frac{1}{n+1} \binom{2n}{n}$$ |
| 45 | + |
| 46 | +If we allow $k$ types of brackets, then each pair can be of any of the $k$ types (independently of the others), thus the number of balanced bracket sequences is: |
| 47 | +$$\frac{1}{n+1} \binom{2n}{n} k^n$$ |
| 48 | + |
| 49 | +### Dynamic programming |
| 50 | + |
| 51 | +Alternatively, these numbers can be computed using **dynamic programming**. |
| 52 | +Let $d[n]$ be the number of balanced bracket sequences with $n$ pairs of brackets. |
| 53 | +Note that in the first position there is always an opening bracket. |
| 54 | +Somewhere later is the closing bracket that matches it. |
| 55 | +It is clear that inside this pair there is a balanced bracket sequence, and similarly after this pair there is a balanced bracket sequence. |
| 56 | +So to compute $d[n]$, we will look at how many balanced sequences of $i$ pairs of brackets are inside this first bracket pair, and how many balanced sequences with $n-1-i$ pairs are after this pair. |
| 57 | +Consequently the formula has the form: |
| 58 | +$$d[n] = \sum_{i=0}^{n-1} d[i] \cdot d[n-1-i]$$ |
| 59 | +The initial value for this recurrence is $d[0] = 1$. |
| 60 | + |
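A minimal sketch of this recurrence, assuming a single bracket type; the function name `count_balanced` and the use of 64-bit unsigned integers are choices made for this example only. The values overflow for larger $n$ (they still fit for $n = 35$), so a real solution would typically compute them modulo some number or with big integers.

```cpp
#include <cstdint>
#include <vector>

// d[m] = number of balanced sequences with m pairs of one bracket type.
// Hypothetical helper; returns the whole table d[0..n].
std::vector<std::uint64_t> count_balanced(int n) {
    std::vector<std::uint64_t> d(n + 1, 0);
    d[0] = 1;  // the empty sequence
    for (int m = 1; m <= n; m++)
        for (int i = 0; i < m; i++)       // i pairs inside the first pair
            d[m] += d[i] * d[m - 1 - i];  // m-1-i pairs after it
    return d;
}
```

For $k$ bracket types the result for $n$ pairs would additionally be multiplied by $k^n$, as in the closed formula above.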
| 61 | +## Finding the lexicographical next balanced sequence |
| 62 | + |
| 63 | +Here we only consider the case with one bracket type. |
| 64 | + |
| 65 | +Given a balanced sequence, we have to find the next (in lexicographical order) balanced sequence. |
| 66 | + |
| 67 | +We have to find the rightmost opening bracket that we can replace by a closing bracket without violating the condition that no prefix contains more closing brackets than opening brackets. |
| 68 | +After replacing this position, we fill the remaining part of the string with the lexicographically minimal one: first as many opening brackets as possible, and then closing brackets in the remaining positions. |
| 69 | +In other words, we leave the longest possible prefix unchanged, and replace the suffix by the lexicographically minimal one. |
| 70 | + |
| 71 | +To find this position, we iterate over the characters from right to left and maintain the balance $\text{depth}$ of closing and opening brackets. |
| 72 | +When we meet an opening bracket, we decrement $\text{depth}$, and when we meet a closing bracket, we increment it. |
| 73 | +If at some point we meet an opening bracket and the balance after processing this symbol is positive, then we have found the rightmost position that we can change. |
| 74 | +We change the symbol, compute the number of opening and closing brackets that we have to add to the right side, and arrange them in the lexicographically minimal way. |
| 75 | + |
| 76 | +If we find no suitable position, then the sequence is already the maximal possible one, and there is no answer. |
| 77 | + |
| 78 | +```cpp next_balanced_brackets_sequence |
| 79 | +bool next_balanced_sequence(string & s) { |
| 80 | +    int n = s.size(); |
| 81 | +    int depth = 0; // closing minus opening brackets in the suffix s[i..n-1] |
| 82 | +    for (int i = n - 1; i >= 0; i--) { |
| 83 | +        if (s[i] == '(') |
| 84 | +            depth--; |
| 85 | +        else |
| 86 | +            depth++; |
| 87 | + |
| 88 | +        if (s[i] == '(' && depth > 0) { // rightmost '(' that can become ')' |
| 89 | +            depth--; // balance after s[i] has been replaced by ')' |
| 90 | +            int open = (n - i - 1 - depth) / 2; |
| 91 | +            int close = n - i - 1 - open; |
| 92 | +            string next = s.substr(0, i) + ')' + string(open, '(') + string(close, ')'); |
| 93 | +            s.swap(next); |
| 94 | +            return true; |
| 95 | +        } |
| 96 | +    } |
| 97 | +    return false; |
| 98 | +} |
| 99 | +``` |
| 100 | + |
| 101 | +This function computes the next balanced bracket sequence in $O(n)$ time, and returns `false` if there is no next one. |
| 102 | + |
| 103 | +## Finding all balanced sequences |
| 104 | + |
| 105 | +Sometimes it is required to find and output all balanced bracket sequences of a specific length $n$. |
| 106 | + |
| 107 | +To generate them, we can start with the lexicographically smallest sequence $((\dots(())\dots))$, and then repeatedly find the lexicographically next sequence with the algorithm described in the previous section. |
| 108 | + |
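The enumeration just described might look like the sketch below. The helper name `all_balanced` is hypothetical, and the parameter `n` is assumed here to be the number of bracket pairs. The function `next_balanced_sequence` repeats the algorithm from the previous section only so that the example compiles on its own.

```cpp
#include <string>
#include <vector>

// Same algorithm as in the previous section, repeated for self-containment.
bool next_balanced_sequence(std::string& s) {
    int n = s.size();
    int depth = 0;
    for (int i = n - 1; i >= 0; i--) {
        if (s[i] == '(') depth--;
        else depth++;
        if (s[i] == '(' && depth > 0) {
            depth--;
            int open = (n - i - 1 - depth) / 2;
            int close = n - i - 1 - open;
            s = s.substr(0, i) + ')' + std::string(open, '(') + std::string(close, ')');
            return true;
        }
    }
    return false;
}

// All balanced sequences with n pairs, in lexicographical order.
std::vector<std::string> all_balanced(int n) {
    std::vector<std::string> result;
    // smallest sequence: n opening brackets followed by n closing brackets
    std::string s = std::string(n, '(') + std::string(n, ')');
    do {
        result.push_back(s);
    } while (next_balanced_sequence(s));
    return result;
}
```

For three pairs this yields the five sequences `((()))`, `(()())`, `(())()`, `()(())`, `()()()` in this order.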
| 109 | +However, if the sequences are not very long (e.g. $n$ smaller than $12$), then we can also conveniently generate all permutations with the C++ STL function `next_permutation`, and check each one for balance. |
| 110 | + |
| 111 | +They can also be generated using the ideas we used for counting all sequences with dynamic programming. |
| 112 | +We will discuss these ideas in the next two sections. |
| 113 | + |
| 114 | +## Sequence index |
| 115 | + |
| 116 | +We are given a balanced bracket sequence with $n$ pairs of brackets. |
| 117 | +We have to find its index in the lexicographically ordered list of all balanced sequences with $n$ bracket pairs. |
| 118 | + |
| 119 | +Let's define an auxiliary array $d[i][j]$, where $i$ is the length of a bracket sequence (semi-balanced: each closing bracket has a corresponding opening bracket, but not every opening bracket necessarily has a corresponding closing one), and $j$ is its balance (the difference between the numbers of opening and closing brackets). |
| 120 | +$d[i][j]$ is the number of such sequences that fit the parameters. |
| 121 | +We will calculate these numbers for only one bracket type. |
| 122 | + |
| 123 | +For the start value $i = 0$ the answer is obvious: $d[0][0] = 1$, and $d[0][j] = 0$ for $j > 0$. |
| 124 | +Now let $i > 0$, and look at the last character of the sequence. |
| 125 | +If the last character is an opening bracket $($, then the state before was $(i-1, j-1)$; if it is a closing bracket $)$, then the previous state was $(i-1, j+1)$. |
| 126 | +Thus we obtain the recurrence: |
| 127 | +$$d[i][j] = d[i-1][j-1] + d[i-1][j+1]$$ |
| 128 | +Here $d[i][j] = 0$ for negative $j$. |
| 129 | +Thus we can compute this array in $O(n^2)$. |
| 130 | + |
| 131 | +Now let us compute the index of a given sequence. |
| 132 | + |
| 133 | +First let there be only one type of bracket. |
| 134 | +We use the counter $\text{depth}$, which tells us how deeply nested we currently are, and iterate over the characters of the sequence. |
| 135 | +If the current character $s[i]$ is equal to $($, then we increment $\text{depth}$. |
| 136 | +If the current character $s[i]$ is equal to $)$, then we add $d[2n-i-1][\text{depth}+1]$ to the answer, which accounts for all possible endings that start with a $($ (these are lexicographically smaller sequences), and then decrement $\text{depth}$. |
| 137 | + |
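For the case of one bracket type, the procedure above can be sketched like this. The function name `sequence_index` is hypothetical, and the result is assumed to be 0-based (the number of lexicographically smaller sequences), so $1$ has to be added when a 1-based index is needed. The input is assumed to be a balanced sequence.

```cpp
#include <cstdint>
#include <string>
#include <vector>

// 0-based index of the balanced sequence s (one bracket type) among all
// balanced sequences of the same length. Assumes s is balanced.
std::uint64_t sequence_index(const std::string& s) {
    int len = s.size();
    int n = len / 2;
    // d[i][j]: semi-balanced sequences of length i with balance j.
    // Column n+1 stays zero, so d[i-1][j+1] is always in range.
    std::vector<std::vector<std::uint64_t>> d(
        len + 1, std::vector<std::uint64_t>(n + 2, 0));
    d[0][0] = 1;
    for (int i = 1; i <= len; i++)
        for (int j = 0; j <= n; j++)
            d[i][j] = (j > 0 ? d[i - 1][j - 1] : 0) + d[i - 1][j + 1];

    std::uint64_t index = 0;
    int depth = 0;
    for (int i = 0; i < len; i++) {
        if (s[i] == '(') {
            depth++;
        } else {
            // every sequence with '(' at position i is smaller than s
            index += d[len - i - 1][depth + 1];
            depth--;
        }
    }
    return index;
}
```

For example, among the five sequences with three pairs, `((()))` gets index $0$ and `()()()` gets index $4$.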
| 138 | +Now let there be $k$ different bracket types. |
| 139 | + |
| 140 | +When we look at the current character $s[i]$, before updating $\text{depth}$, we have to go through all bracket characters that are smaller than the current one, try to put each of them into the current position (obtaining a new balance $\text{ndepth} = \text{depth} \pm 1$; a closing bracket is only possible if it matches the innermost open bracket), and add the number of ways to finish the sequence (length $2n-i-1$, balance $\text{ndepth}$) to the answer: |
| 141 | +$$d[2n - i - 1][\text{ndepth}] \cdot k^{\frac{2n - i - 1 - \text{ndepth}}{2}}$$ |
| 142 | +This formula can be derived as follows. |
| 143 | +First we "forget" that there are multiple bracket types, and just take the answer $d[2n - i - 1][\text{ndepth}]$. |
| 144 | +Now we consider how the answer changes if we have $k$ types of brackets. |
| 145 | +We have $2n - i - 1$ undefined positions, of which $\text{ndepth}$ are already predetermined, because they must close the currently open brackets. |
| 146 | +All the other brackets ($(2n - i - 1 - \text{ndepth})/2$ pairs) can be of any type, therefore we multiply the number by the corresponding power of $k$. |
| 147 | + |
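As a small check of this formula, assume $k = 2$ bracket types, a remaining length of $2n - i - 1 = 3$ and $\text{ndepth} = 1$ (these values are chosen only for illustration):

$$d[3][1] \cdot k^{\frac{3 - 1}{2}} = 2 \cdot 2^1 = 4$$

Indeed, ignoring the types, the only two possible shapes of the ending are `)()` and `())`. In each of them one closing bracket is forced to match the bracket that is already open, while the one remaining pair can be of either type.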
| 148 | +## Finding the $k$-th sequence |
| 149 | + |
| 150 | +Let $n$ be the number of bracket pairs in the sequence. |
| 151 | +For a given $k$, we have to find the $k$-th balanced sequence in the lexicographically sorted list of all balanced sequences. |
| 152 | + |
| 153 | +As in the previous section we compute the auxiliary array $d[i][j]$, the number of semi-balanced bracket sequences of length $i$ with balance $j$. |
| 154 | + |
| 155 | +First, we start with only one bracket type. |
| 156 | + |
| 157 | +We will iterate over the characters in the string we want to generate. |
| 158 | +As in the previous problem we store a counter $\text{depth}$, the current nesting depth. |
| 159 | +In each position we have to decide whether to put an opening or a closing bracket. |
| 160 | +We put an opening bracket if $d[2n - i - 1][\text{depth}+1] \ge k$. |
| 161 | +In that case we increment the counter $\text{depth}$, and move on to the next character. |
| 162 | +Otherwise we decrement $k$ by $d[2n - i - 1][\text{depth}+1]$, put a closing bracket and move on. |
| 163 | + |
| 164 | +```cpp kth_balances_bracket |
| 165 | +string kth_balanced(int n, int k) { |
| 166 | +    vector<vector<int>> d(2*n+1, vector<int>(n+1, 0)); |
| 167 | +    d[0][0] = 1; |
| 168 | +    for (int i = 1; i <= 2*n; i++) { |
| 169 | +        d[i][0] = d[i-1][1]; |
| 170 | +        for (int j = 1; j < n; j++) |
| 171 | +            d[i][j] = d[i-1][j-1] + d[i-1][j+1]; |
| 172 | +        d[i][n] = d[i-1][n-1]; |
| 173 | +    } |
| 174 | + |
| 175 | +    string ans; |
| 176 | +    int depth = 0; |
| 177 | +    for (int i = 0; i < 2*n; i++) { |
| 178 | +        if (depth + 1 <= n && d[2*n-i-1][depth+1] >= k) { |
| 179 | +            ans += '('; |
| 180 | +            depth++; |
| 181 | +        } else { |
| 182 | +            ans += ')'; |
| 183 | +            if (depth + 1 <= n) |
| 184 | +                k -= d[2*n-i-1][depth+1]; |
| 185 | +            depth--; |
| 186 | +        } |
| 187 | +    } |
| 188 | +    return ans; |
| 189 | +} |
| 190 | +``` |
| 191 | + |
| 192 | +Now let there be $k$ types of brackets. |
| 193 | +The solution differs only slightly: we have to multiply the value $d[2n-i-1][\text{ndepth}]$ by $k^{(2n-i-1-\text{ndepth})/2}$ and take into account that there can be different bracket types for the next character. |
| 194 | + |
| 195 | +Here is an implementation using two types of brackets: round and square: |
| 196 | + |
| 197 | +```cpp kth_balances_bracket_multiple |
| 198 | +string kth_balanced2(int n, int k) { |
| 199 | +    vector<vector<int>> d(2*n+1, vector<int>(n+1, 0)); |
| 200 | +    d[0][0] = 1; |
| 201 | +    for (int i = 1; i <= 2*n; i++) { |
| 202 | +        d[i][0] = d[i-1][1]; |
| 203 | +        for (int j = 1; j < n; j++) |
| 204 | +            d[i][j] = d[i-1][j-1] + d[i-1][j+1]; |
| 205 | +        d[i][n] = d[i-1][n-1]; |
| 206 | +    } |
| 207 | + |
| 208 | +    string ans; |
| 209 | +    int depth = 0; |
| 210 | +    stack<char> st; // types of the currently open brackets |
| 211 | +    for (int i = 0; i < 2*n; i++) { |
| 212 | +        // '(' (the second condition also avoids a negative shift count) |
| 213 | +        if (depth + 1 <= n && depth + 1 <= 2*n-i-1) { |
| 214 | +            int cnt = d[2*n-i-1][depth+1] << ((2*n-i-1-depth-1) / 2); |
| 215 | +            if (cnt >= k) { |
| 216 | +                ans += '('; |
| 217 | +                st.push('('); |
| 218 | +                depth++; |
| 219 | +                continue; |
| 220 | +            } |
| 221 | +            k -= cnt; |
| 222 | +        } |
| 223 | + |
| 224 | +        // ')' |
| 225 | +        if (depth && st.top() == '(') { |
| 226 | +            int cnt = d[2*n-i-1][depth-1] << ((2*n-i-1-depth+1) / 2); |
| 227 | +            if (cnt >= k) { |
| 228 | +                ans += ')'; |
| 229 | +                st.pop(); |
| 230 | +                depth--; |
| 231 | +                continue; |
| 232 | +            } |
| 233 | +            k -= cnt; |
| 234 | +        } |
| 235 | + |
| 236 | +        // '[' (the second condition also avoids a negative shift count) |
| 237 | +        if (depth + 1 <= n && depth + 1 <= 2*n-i-1) { |
| 238 | +            int cnt = d[2*n-i-1][depth+1] << ((2*n-i-1-depth-1) / 2); |
| 239 | +            if (cnt >= k) { |
| 240 | +                ans += '['; |
| 241 | +                st.push('['); |
| 242 | +                depth++; |
| 243 | +                continue; |
| 244 | +            } |
| 245 | +            k -= cnt; |
| 246 | +        } |
| 247 | + |
| 248 | +        // ']' |
| 249 | +        ans += ']'; |
| 250 | +        st.pop(); |
| 251 | +        depth--; |
| 252 | +    } |
| 253 | +    return ans; |
| 254 | +} |
| 255 | +``` |