Description
  • Given a line containing parentheses, each parenthesis is assigned a number called its `depth'.
  • We use a counter and the following procedure to determine the depth of a parenthesis:
    • The initial value of the counter is zero.
    • Scanning is performed from left to right.
    • If an opening parenthesis is reached, increase the counter by one; the new value of the counter is the depth of that parenthesis.
    • If a closing parenthesis is reached, the current value of the counter is its depth; then decrease the counter by one.
  • In the following example, each parenthesis's depth is shown below it:
    A ( B ( C ( D ) E ( F ( G ) H ( I ) J ) K ) L ) M ( N ( O ( P ) R ) S ) T
      1   2   3   3   3   4   4   4   4   3   2   1   1   2   3   3   2   1 
    
    We want to extract the data enclosed by parentheses of depth 3, including the parentheses themselves.
  • This script assumes that each line is independent of the lines that follow it. For a multiline version, see Multiline Version.
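The depth procedure described above can be sketched outside sed as a quick cross-check. The following Python snippet is only an illustration, not part of the script; the function name `paren_depths` is a hypothetical helper introduced here.

```python
def paren_depths(line):
    """Return the depth of every parenthesis in `line`, left to right."""
    depths = []
    counter = 0                     # the counter starts at zero
    for ch in line:                 # scan from left to right
        if ch == "(":
            counter += 1            # increase first, then use the value
            depths.append(counter)
        elif ch == ")":
            depths.append(counter)  # use the value first, then decrease
            counter -= 1
    return depths


example = "A ( B ( C ( D ) E ( F ( G ) H ( I ) J ) K ) L ) M ( N ( O ( P ) R ) S ) T"
print(paren_depths(example))
# prints [1, 2, 3, 3, 3, 4, 4, 4, 4, 3, 2, 1, 1, 2, 3, 3, 2, 1]
```

The printed list matches the row of depths shown under the example line.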
Raw Input
A ( B ( C ( D ) E ( F ( G ) H ( I ) J ) K ) L ) M ( N ( O ( P ) R ) S ) T
Desired Output
( D )
( F ( G ) H ( I ) J )
( P )
Script and Comments
Script1
[ 1] /\n/!s/^|$/\n/g
[ 2] :loop
[ 3] s/\n([^()\n]*)/\1\n/
[ 4] /\n\n/d
[ 5] /\n\(/{
[ 6] s/$/#/
[ 7] /\n#{3}$/s/^[^\n]*//
[ 8] }
[ 9] /\n\)/{
[10] /\n#{3}$/{
[11] s/#$//
[12] s/\n\)/)\n\n/
[13] P
[14] D
[15] }
[16] s/#$//
[17] }
[18] s/\n([()])/\1\n/
[19] b loop
Comments -r
  1. A counter is required to determine each parenthesis's depth. Its value is
    • represented by the same number of hash signs (`#'),
    • stored at the end of the pattern space (PS), separated from the original line by a newline character.
  2. Before a line is processed, s/^|$/\n/g of Step [1] inserts a newline character at both the start and the end of the line, where
    • The first newline marks the parenthesis to be processed. Step [3] moves it forward to just before the next parenthesis.
    • The last newline separates the counter from the original line.
    • The /\n/! address restricts Step [1] to a freshly read line, so a cycle restarted by Step [14] does not insert the newlines again.
    • When the two newlines become adjacent, the whole line has been processed. Therefore, `d' of Step [4] deletes the PS and makes sed start a new cycle.
  3. If the parenthesis in question is an opening one,
    • Step [6] increases the counter by one. The resulting value is the depth of that parenthesis.
    • If the depth matches, Step [7] deletes everything before the first newline. The data from here up to the matching closing parenthesis will be printed later by Step [13].
  4. If the parenthesis in question is a closing one,
    • After the depth has been checked, the counter has to be decreased by one. This is done by either Step [11] or Step [16].
    • If the depth matches, the enclosed data ends here. To print the enclosed data,
      • Step [12] swaps the parenthesis and the newline, and inserts one more newline because Step [14] will consume one.
      • `P' of Step [13] prints the PS up to the first newline, that is, the enclosed data.
      • `D' of Step [14] deletes everything up to and including the first newline, then makes sed restart the cycle at Step [1] without reading a new line of input.
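To follow how the pattern space evolves, the steps of Script1 can be imitated in Python. This is a rough sketch under stated assumptions, not a replacement for the sed script: the function name `simulate_script1` and the `target` parameter are hypothetical, the input is assumed to be a single line with no embedded newline, and each statement is tagged with the step it is meant to mirror.

```python
import re


def simulate_script1(line, target=3):
    """Imitate Script1 on one line; return the lines sed would print."""
    printed = []
    ps = "\n" + line + "\n"                                   # Step [1]
    while True:                                               # Step [2]
        # Step [3]: move the marker newline past non-parenthesis text.
        ps = re.sub(r"\n([^()\n]*)", r"\1\n", ps, count=1)
        if "\n\n" in ps:                                      # Step [4]
            return printed
        if "\n(" in ps:                                       # Step [5]
            ps += "#"                                         # Step [6]
            if ps.endswith("\n" + "#" * target):              # Step [7]
                ps = re.sub(r"^[^\n]*", "", ps, count=1)
        if "\n)" in ps:                                       # Step [9]
            if ps.endswith("\n" + "#" * target):              # Step [10]
                ps = re.sub(r"#\Z", "", ps)                   # Step [11]
                ps = ps.replace("\n)", ")\n\n", 1)            # Step [12]
                first, _, rest = ps.partition("\n")
                printed.append(first)                         # Step [13]
                ps = rest                                     # Step [14]
                continue                                      # restart the cycle
            ps = re.sub(r"#\Z", "", ps)                       # Step [16]
        # Step [18]: move the marker newline past the parenthesis.
        ps = re.sub(r"\n([()])", r"\1\n", ps, count=1)


raw = "A ( B ( C ( D ) E ( F ( G ) H ( I ) J ) K ) L ) M ( N ( O ( P ) R ) S ) T"
for item in simulate_script1(raw):
    print(item)
# prints:
# ( D )
# ( F ( G ) H ( I ) J )
# ( P )
```

Adding `print(repr(ps))` at the top of the loop shows the marker newline moving to the right and the run of `#' growing and shrinking at the end of the pattern space.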