| Raw Input
|
Each non-blank line of the datafile consists of either
- a tag only, or
- a tag, followed by a space, a minus sign, a space, and some data
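The line format above can be checked mechanically. The following is a minimal sketch, not part of the original script: it assumes that a tag is a run of upper-case letters (the description does not state this), and the three-line sample, including the malformed line, is made up for illustration.

```shell
# Hypothetical sample: two well-formed lines and one malformed line.
sample='BOR
AU - Daniel P. Bovet
TI= missing separator'

# Assumption: a tag is a run of upper-case letters.
# The outer group is optional so that blank lines are accepted.
# -v inverts the match, so only lines that break the format are kept.
bad=$(printf '%s\n' "$sample" | grep -Ev '^([A-Z]+( - .+)?)?$')
printf '%s\n' "$bad"
```

Only the line without the " - " separator is reported; the tag-only line and the tag-plus-data line pass.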
The following table describes each tag:
| BOF | beginning of file |
| EOF | end of file |
| BOR | beginning of record |
| EOR | end of record |
| AU | author |
| TI | title |
| LA | language |
A sample of the datafile is:
| BOF
BOR
AU - Daniel P. Bovet
AU - Macro Cesati
TI - Understanding the Linux Kernel
TI - From I/O Ports
TI - to Process Management
LA - English
EOR
BOR
AU - John E. Hopcroft
AU - Jeffery D. Ullman
TI - Introduction to Automata Theory
TI - Languages, and Computation
EOR
EOF
|
|
| Desired Output
| |
In some records
(blocks delimited by a BOR/EOR pair),
consecutive lines begin with the same tag.
We want to join such lines into one, drop the repeated tags, and separate the joined data with a comma and a space.
For example:
| BOF
BOR
AU - Daniel P. Bovet, Macro Cesati
TI - Understanding the Linux Kernel, From I/O Ports, to Process Management
LA - English
EOR
BOR
AU - John E. Hopcroft, Jeffery D. Ullman
TI - Introduction to Automata Theory, Languages, and Computation
EOR
EOF
|
|
Script and Comments
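Script1:
# Append the next input line to the pattern space.
:loop
N
# If both lines start with the same tag, replace the newline
# and the repeated "TAG -" with a comma, then try the next line.
/^\([^ ]*\) .*\n\1 /{
s/\n\([^-]*\)-/,/
b loop
}
# Otherwise print the first line, delete it, and restart
# the script on the remaining line without reading new input.
P
D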
| |
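One way to try Script1 is to save it to a file and pass it to sed with `-f`. The sketch below is only an illustration: the file names `join.sed` and `books.dat` are hypothetical, the data is a shortened copy of the sample above, and GNU sed is assumed (other implementations may handle `N` on the last input line differently and drop the final line).

```shell
# Scratch directory; the file names used below are hypothetical.
dir=$(mktemp -d) || exit 1

# Script1, saved without line numbers.
cat > "$dir/join.sed" <<'SED_END'
:loop
N
/^\([^ ]*\) .*\n\1 /{
s/\n\([^-]*\)-/,/
b loop
}
P
D
SED_END

# A shortened copy of the sample datafile (one record).
cat > "$dir/books.dat" <<'DATA_END'
BOF
BOR
AU - John E. Hopcroft
AU - Jeffery D. Ullman
TI - Introduction to Automata Theory
TI - Languages, and Computation
EOR
EOF
DATA_END

# Run the script; the datafile itself is left unchanged.
result=$(sed -f "$dir/join.sed" "$dir/books.dat")
printf '%s\n' "$result"
```

The two AU lines and the two TI lines each come out as a single line, while the tag-only lines (BOF, BOR, EOR, EOF) pass through untouched because the address on the third script line requires a space after the tag.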