Description
  • Each line of File1 is a key consisting of only digits.
  • Each line of File2 is a record beginning with a key followed by a blank and data.
We want to list the records of File2 whose keys are not listed in File1.
Raw Input
File1: list of keys.
4164754
5859999
6123574
5851388
3214587
File2: Records.
9874563 The number calling is 111
5851388 The number calling is OK
6237733 the number is wrong
5859999 The number calling is ok
Desired Output
9874563 The number calling is 111
6237733 the number is wrong
Script and Comments
Script1
[ 1] :loop
[ 2] 1,/\n[0-9]+ [^\n]*$/{
[ 3] /\n[0-9]+ [^\n]*$/!{
[ 4] N
[ 5] b loop
[ 6] }
[ 7] h
[ 8] s/\n[^\n]*$//
[ 9] x
[10] s/^.*\n//
[11] }
[12] G
[13] /^([0-9]+) .*\n\1(\n|$)/!P
[14] d
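To try Script1, the step labels `[ 1]' thru `[14]' must be dropped, since they are not sed syntax. The following is a minimal sketch, assuming GNU sed and a POSIX shell with `mktemp'; the scratch directory and the variable names `workdir' and `result' are only illustrative.

```shell
#!/bin/sh
# Work in a scratch directory so no existing file is overwritten.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# File1: the list of keys from the example.
cat > File1 <<'EOF'
4164754
5859999
6123574
5851388
3214587
EOF

# File2: the records from the example.
cat > File2 <<'EOF'
9874563 The number calling is 111
5851388 The number calling is OK
6237733 the number is wrong
5859999 The number calling is ok
EOF

# Script1 with the step labels removed (quoted heredoc keeps backslashes intact).
cat > script_file <<'EOF'
:loop
1,/\n[0-9]+ [^\n]*$/{
/\n[0-9]+ [^\n]*$/!{
N
b loop
}
h
s/\n[^\n]*$//
x
s/^.*\n//
}
G
/^([0-9]+) .*\n\1(\n|$)/!P
d
EOF

# Feed File1 first, then File2, to a single sed process.
result=$(sed -r -f script_file File1 File2)
printf '%s\n' "$result"
```

With the sample data, the two records with keys 9874563 and 6237733 should be printed, as in the desired output.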
Comments
  1. The `-r' option of GNU sed must be used to interpret REs as EREs.
  2. The Pattern Space and the Hold Space are abbreviated to `PS' and `HS', respectively.
  3. To feed File1 and then File2 to the same sed process, run
    sed -r -f script_file File1 File2
  4. The script uses the following approach:
    • Read all keys and keep them in the Hold Space, separated by newlines.
    • After reading a record into the Pattern Space, append the list of keys from the Hold Space, then print the record if its key is not in the list.
  5. The address of Step [2] makes
    • the block consisting of Steps [2] thru [11] apply ONLY to lines of File1 and the first one of File2, and
    • Steps [12] thru [14] apply ONLY to lines of File2.
  6. After Step [6], PS contains all keys and the first line of File2.
  7. After Step [11], PS contains the first line of File2, HS contains all keys separated by newlines.
  8. Steps [12] thru [14] constitute an implicit loop where
    • Step [12] attaches the list of keys from HS to the end of PS via command `G'.
    • Step [13] prints the record in question if its key is not in the list.
    • Step [14] deletes PS and starts a new cycle to read the next record.
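One way to see what Step [12] leaves in PS is to run only Steps [1] thru [12] and let sed auto-print PS at the end of each cycle. The sketch below assumes GNU sed; the two keys and two records are made up for illustration (they are not the sample data), and the names `keys', `records', `trace_script' and `trace' are hypothetical.

```shell
#!/bin/sh
# Work in a scratch directory so no existing file is overwritten.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Made-up keys and records, kept tiny so the trace is easy to read.
printf '111\n222\n' > keys
printf '222 b\n333 c\n' > records

# Steps [1] thru [12] only: without Steps [13] and [14],
# sed prints the whole PS when each cycle ends.
cat > trace_script <<'EOF'
:loop
1,/\n[0-9]+ [^\n]*$/{
/\n[0-9]+ [^\n]*$/!{
N
b loop
}
h
s/\n[^\n]*$//
x
s/^.*\n//
}
G
EOF

trace=$(sed -r -f trace_script keys records)
printf '%s\n' "$trace"
```

Each record comes out followed by the full list of keys, which is exactly the text that the RE of Step [13] is matched against. With these inputs the output is six lines: `222 b', `111', `222', then `333 c', `111', `222'.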