Description
When a user account is deleted from a Linux/Unix system, system administrators sometimes forget to remove that user from /etc/group. This script was written to clean up such leftover group members.

This script requires two input files: the passwd file and the group file. Each line of the passwd file contains 7 colon-separated fields, while each line of the group file contains only 4, so the number of fields tells us whether a line should be interpreted as a passwd record or a group record.
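As a rough illustration of this field-count test, here is a minimal Python sketch. Python is used only for illustration, and the function name classify is hypothetical and not part of the sed script. Note that the sed scripts below do not count fields exactly: their Step [1] only checks that a line begins with six colon-terminated fields.

```python
def classify(line: str) -> str:
    """Classify a record by its number of colon-separated fields."""
    fields = line.rstrip("\n").split(":")
    if len(fields) == 7:
        return "passwd"   # e.g. name:x:uid:gid:gecos:home:shell
    if len(fields) == 4:
        return "group"    # e.g. name:x:gid:member,member,...
    return "unknown"


print(classify("qoo:x:501:100::/home/qoo:/bin/bash"))  # passwd
print(classify("grp02:x:503:john,qoo,ping,solomon"))   # group
```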

The sample passwd file is:

root:x:0:0:root:/root:/bin/bash
bin:x:1:1:bin:/bin:/sbin/nologin
solomon:x:500:500::/home/changyj:/bin/bash
qoo:x:501:100::/home/qoo:/bin/bash
joshua:x:503:100::/home/joshua:/bin/bash
abigail:x:504:100::/home/abigail:/bin/bash
eric:x:504:100::/home/eric:/bin/bash
jonathan:x:504:100::/home/jonathan:/bin/bash
johnson:x:505:100::/home/johnson:/bin/bash
tina:x:506:100::/home/tina:/bin/bash
tracy:x:507:100::/home/tracy:/bin/bash

and the sample group file is:

root:x:0:root
sales:x:501:johnson,joshua,kelly
grp01:x:502:eric,johnson,kelly,ping,abigail,jonathan,tracy,qoo
grp02:x:503:john,qoo,ping,solomon
grp03:x:504:solomon,david,eric,qoo
adm:x:505:solomon,tina,ping,abigail,david
where kelly, ping, john, and david are non-existent users, that is, names that do not appear in the passwd file.

We want to remove these non-existent users from the group file.
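To make the goal concrete, the following Python sketch computes the expected result directly with a set lookup. It is not the sed approach described below, only a reference against which a script's output can be compared. The function name prune_groups is hypothetical, and the sample data is copied from the two files above.

```python
PASSWD = """\
root:x:0:0:root:/root:/bin/bash
bin:x:1:1:bin:/bin:/sbin/nologin
solomon:x:500:500::/home/changyj:/bin/bash
qoo:x:501:100::/home/qoo:/bin/bash
joshua:x:503:100::/home/joshua:/bin/bash
abigail:x:504:100::/home/abigail:/bin/bash
eric:x:504:100::/home/eric:/bin/bash
jonathan:x:504:100::/home/jonathan:/bin/bash
johnson:x:505:100::/home/johnson:/bin/bash
tina:x:506:100::/home/tina:/bin/bash
tracy:x:507:100::/home/tracy:/bin/bash
"""

GROUP = """\
root:x:0:root
sales:x:501:johnson,joshua,kelly
grp01:x:502:eric,johnson,kelly,ping,abigail,jonathan,tracy,qoo
grp02:x:503:john,qoo,ping,solomon
grp03:x:504:solomon,david,eric,qoo
adm:x:505:solomon,tina,ping,abigail,david
"""


def prune_groups(passwd_text, group_text):
    """Return the group lines with members missing from passwd removed."""
    users = {line.split(":")[0] for line in passwd_text.splitlines() if line}
    result = []
    for line in group_text.splitlines():
        name, password, gid, members = line.split(":")
        kept = [m for m in members.split(",") if m in users]
        result.append(":".join([name, password, gid, ",".join(kept)]))
    return result


for line in prune_groups(PASSWD, GROUP):
    print(line)
# root:x:0:root
# sales:x:501:johnson,joshua
# grp01:x:502:eric,johnson,abigail,jonathan,tracy,qoo
# grp02:x:503:qoo,solomon
# grp03:x:504:solomon,eric,qoo
# adm:x:505:solomon,tina,abigail
```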

Script and Comments
Script1
[ 1] /^([^:]*:){6}/{
[ 2] s/:.*$//
[ 3] H
[ 4] d
[ 5] }
[ 6] G
[ 7] s/^.*:/&\n/
[ 8] :loop
[ 9] s/\n([^\n,]+)(,[^\n]+|)(\n.*)\n\1(\n|$)/\1\n\2\3\4/
[10] /^[^\n]+\n[,\n]/!s/\n[^\n,]+/\n/
[11] /\n\n(\n[^\n]+)*$/{
[12] P
[13] d
[14] }
[15] s/\n,/,\n/
[16] b loop
Comments (sed must be run with the -r option)
  1. This script uses the following approach:
    • First, read every line of the passwd file to build a list of usernames.
      This list is kept in the hold space (HS).
    • Then, examine each member of every group to see whether that member exists in the list.
  2. You must give sed the passwd file first, followed by the group file, on the same command line, for example:
    sed -r -f this_script passwd_file group_file
  3. Steps [1] thru [5] constitute an implicit loop which reads all lines of the passwd file and builds a list of usernames from them. For the sample input, the contents of the list are
    \n root \n bin \n solomon \n qoo \n joshua \n abigail \n eric \n jonathan \n johnson \n tina \n tracy
    where \n stands for the newline character and spaces are inserted here to enhance the readability.
  4. In Step [6],
    • Command `G' appends the username list kept in HS to the pattern space (PS), separating it from the original data with a newline character.
    • Remember that the list kept in HS begins with a newline character.
    • Therefore, after this step the original data is followed by two newline characters.
    • For example, grp03:x:504:solomon,david,eric,qoo\n\nroot\nbin\n....
  5. Step [7] inserts a newline character before the first member of the group; this newline serves as a `mark'.
  6. Steps [8] thru [16] constitute a loop which will examine one member per iteration to figure out whether to keep or remove it. The member to be checked in an iteration is the one preceded by the `mark'.
  7. Step [9]
    • tries to remove the member's username from the user list and to move the mark after that member.
    • A successful substitution implies that member must be kept. After the substitution, the `mark' is followed by
      • either a comma, or
      • the newline character inserted by `G' of Step [6].
    • For example, PS before this step:
      grp03:x:504:\nsolomon,david,eric,qoo\n\nroot\nbin\nsolomon\nqoo...
    • PS after this step:
      grp03:x:504:solomon\n,david,eric,qoo\n\nroot\nbin\nqoo...
    • A failed substitution implies that the member should be removed. In this case, Step [10] removes it.
  8. Moreover, we need a way to know whether the last member has been examined:
    • Assume that a group has qoo as its last member:
      Case                      PS before Steps [9,10]                      PS after Steps [9,10]
      User qoo exists           ...\nqoo\n(\nUsername)*\nqoo(\nUsername)*$  ...qoo\n\n(\nUsername)*$
      User qoo does NOT exist   ...\nqoo\n(\nUsername)*$                    ...\n\n(\nUsername)*$
    • In both cases, once the last member has been moved past or removed, PS matches \n\n(\n[^\n]+)*$, which is the test used in Step [11].
  9. If the member examined in this iteration is not the last one, Step [15] moves the mark after the comma following the member, and Step [16] makes sed jump to Step [8] to examine the next member.
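The walkthrough above can be checked with a small emulation. The following Python sketch mirrors Script1 step by step using the re module; it is an approximation, not a replacement for sed. Assumptions: re.S is used so that . can match the newlines embedded in PS (Step [9] needs this to span the username list), \Z stands in for sed's $, and the function name script1 is hypothetical. Python's regex engine is not POSIX leftmost-longest, so the results are only claimed for the sample data.

```python
import re

PASSWD = [
    "root:x:0:0:root:/root:/bin/bash",
    "bin:x:1:1:bin:/bin:/sbin/nologin",
    "solomon:x:500:500::/home/changyj:/bin/bash",
    "qoo:x:501:100::/home/qoo:/bin/bash",
    "joshua:x:503:100::/home/joshua:/bin/bash",
    "abigail:x:504:100::/home/abigail:/bin/bash",
    "eric:x:504:100::/home/eric:/bin/bash",
    "jonathan:x:504:100::/home/jonathan:/bin/bash",
    "johnson:x:505:100::/home/johnson:/bin/bash",
    "tina:x:506:100::/home/tina:/bin/bash",
    "tracy:x:507:100::/home/tracy:/bin/bash",
]
GROUP = [
    "root:x:0:root",
    "sales:x:501:johnson,joshua,kelly",
    "grp01:x:502:eric,johnson,kelly,ping,abigail,jonathan,tracy,qoo",
    "grp02:x:503:john,qoo,ping,solomon",
    "grp03:x:504:solomon,david,eric,qoo",
    "adm:x:505:solomon,tina,ping,abigail,david",
]

STEP9 = re.compile(r"\n([^\n,]+)(,[^\n]+|)(\n.*)\n\1(\n|\Z)", re.S)


def script1(passwd_lines, group_lines):
    """Emulate Script1; returns the lines that sed would print."""
    hold = ""                                    # HS starts out empty
    printed = []
    for ps in passwd_lines + group_lines:        # sed sees one input stream
        if re.match(r"([^:]*:){6}", ps):         # [1] passwd record
            hold += "\n" + re.sub(r":.*$", "", ps)   # [2] keep name, [3] H
            continue                             # [4] d
        ps = ps + "\n" + hold                    # [6] G
        ps = re.sub(r"^.*:", lambda m: m.group(0) + "\n", ps,
                    count=1, flags=re.S)         # [7] insert the mark
        while True:                              # [8] :loop
            ps = STEP9.sub(r"\1\n\2\3\4", ps, count=1)       # [9]
            if not re.match(r"[^\n]+\n[,\n]", ps):           # [10] no match:
                ps = re.sub(r"\n[^\n,]+", "\n", ps, count=1)  # drop member
            if re.search(r"\n\n(\n[^\n]+)*\Z", ps):          # [11] last one?
                printed.append(ps.split("\n", 1)[0])         # [12] P
                break                                        # [13] d
            ps = re.sub(r"\n,", ",\n", ps, count=1)          # [15] move mark
    return printed


for line in script1(PASSWD, GROUP):
    print(line)
# root:x:0:root
# sales:x:501:johnson,joshua,
# grp01:x:502:eric,johnson,,,abigail,jonathan,tracy,qoo
# grp02:x:503:,qoo,,solomon
# grp03:x:504:solomon,,eric,qoo
# adm:x:505:solomon,tina,,abigail,
```

In this emulation each removed member leaves its comma behind, so the output still contains leading, trailing, and doubled commas.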
Script2
[ 1] /^([^:]*:){6}/{
[ 2] s/:.*$//
[ 3] H
[ 4] d
[ 5] }
[ 6] G
[ 7] s/^.*:/&\n/
[ 8] :loop
[ 9] s/\n([^\n,]+)(,[^\n]+|)(\n.*)\n\1(\n|$)/\1\n\2\3\4/
[10] /^[^\n]+\n[,\n]/!s/\n[^\n,]+/\n/
[11] /\n\n(\n[^\n]+)*$/{
[12] s/,+/,/g
[13] s/(:),|,(\n)/\1/g
[14] P
[15] d
[16] }
[17] s/\n,/,\n/
[18] b loop
Comments (sed must be run with the -r option)
  1. The first script does not remove the redundant commas left behind by deleted members before printing the result. For example, it can print a line such as:
    grp01:x:502:eric,,johnson,,,,abigail,,,jonathan,tracy,qoo
    This script uses two more commands to remove them:
    • First Step [12] replaces consecutive commas with one.
    • Then Step [13] removes any comma if
      • it is preceded by :, the separator used in the group file, or
      • it is the last character before the newline character separating the member list and the user list.
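The effect of Steps [12] and [13] can be tried in isolation with the following Python sketch. It assumes PS has the shape reached at Step [11], that is, the group line followed by newlines and the remaining usernames; the function name tidy and the shortened username lists are hypothetical.

```python
import re


def tidy(ps: str) -> str:
    """Apply the comma clean-up of Steps [12] and [13] to a pattern space."""
    ps = re.sub(r",+", ",", ps)                   # [12] squeeze comma runs
    # [13] drop a comma that follows ':' (keeping the ':'), and drop a
    # comma together with the newline right after it; sed's \1 is empty
    # when the second alternative matches, hence the "or ''" here.
    ps = re.sub(r"(:),|,(\n)", lambda m: m.group(1) or "", ps)
    return ps


samples = [
    "grp01:x:502:eric,,johnson,,,,abigail,,,jonathan,tracy,qoo\n\n\nroot\nbin",
    "grp02:x:503:,qoo,,solomon\n\n\nroot\nbin",
    "adm:x:505:solomon,tina,,abigail,\n\n\nroot\nbin",
]
for ps in samples:
    print(tidy(ps).split("\n", 1)[0])             # what `P' would print
# grp01:x:502:eric,johnson,abigail,jonathan,tracy,qoo
# grp02:x:503:qoo,solomon
# adm:x:505:solomon,tina,abigail
```

In the trailing-comma case the newline after the comma is removed as well, but `P' still prints only the group line because further newlines follow it.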