Description
Given a file whose lines have the form
    <a href=...>KEY</a>
keep only the first line of each run of consecutive lines that share the same KEY.

Raw Input
    <a href="ind0001.html#i616">Blob</a>
    <a href="ind0002.html#i5">Blob</a>
    <a href="ind0004.html#i3546">Doe</a>
    <a href="ind0003.html#i3556">Doe</a>
    <a href="ind0001.html#k100">Newton</a>
    <a href="ind0007.html#j331">Martin</a>
    <a href="ind0009.html#j2479">Martin</a>
    <a href="ind0008.html#l779">Martin</a>

Desired Output
    <a href="ind0001.html#i616">Blob</a>
    <a href="ind0004.html#i3546">Doe</a>
    <a href="ind0001.html#k100">Newton</a>
    <a href="ind0007.html#j331">Martin</a>

Script and Comments
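Script1
    # The unescaped ( ) are ERE grouping: run with sed -E (plain sed needs \( and \))
    :loop
    # Append the next input line, unless this is already the last one
    $!N
    # No repeated key across the two lines: print the first line, delete it, restart
    />([^<]*)<.*\n.*>\1<.*/!{
    P
    D
    }
    # Same key on both lines: discard the second line and fetch another
    s/\n.*//
    b loop
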
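Script2
    # The unescaped ( ) are ERE grouping: run with sed -E (plain sed needs \( and \))
    :loop
    $!N
    # Same key on both lines: drop the second line
    />([^<]*)<.*\n.*>\1<.*/s/\n.*//
    # t branches only when the substitution happened, so keep collapsing the run
    t loop
    # Otherwise print the first line, delete it, and restart with what remains
    P
    D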
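
To try both scripts against the sample data, a small shell harness along the following lines may help. It is only a sketch: the file names (input.html, script1.sed, script2.sed) and the temporary directory are illustrative assumptions, not part of the original page, and it assumes a sed that accepts -E for extended regular expressions, since the scripts use unescaped ( ) for grouping.

```shell
#!/bin/sh
# Hypothetical scratch directory and file names, chosen for this example only.
workdir=$(mktemp -d)

# The sample data from "Raw Input".
cat > "$workdir/input.html" <<'EOF'
<a href="ind0001.html#i616">Blob</a>
<a href="ind0002.html#i5">Blob</a>
<a href="ind0004.html#i3546">Doe</a>
<a href="ind0003.html#i3556">Doe</a>
<a href="ind0001.html#k100">Newton</a>
<a href="ind0007.html#j331">Martin</a>
<a href="ind0009.html#j2479">Martin</a>
<a href="ind0008.html#l779">Martin</a>
EOF

# Script1: explicit branch back to the label after dropping a duplicate.
cat > "$workdir/script1.sed" <<'EOF'
:loop
$!N
/>([^<]*)<.*\n.*>\1<.*/!{
P
D
}
s/\n.*//
b loop
EOF

# Script2: same idea, using t to loop only after a successful substitution.
cat > "$workdir/script2.sed" <<'EOF'
:loop
$!N
/>([^<]*)<.*\n.*>\1<.*/s/\n.*//
t loop
P
D
EOF

# -E enables extended regular expressions; -f reads the script from a file.
out1=$(sed -E -f "$workdir/script1.sed" "$workdir/input.html")
out2=$(sed -E -f "$workdir/script2.sed" "$workdir/input.html")

printf '%s\n' "$out1"
if [ "$out1" = "$out2" ]; then
    echo "both scripts agree"
fi

rm -rf "$workdir"
```

With the sample input, the printed lines should be the four lines listed under Desired Output, followed by the agreement message. Note that only adjacent duplicates are collapsed, so a key that reappears later in the file after a different key is kept again.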