Description
- A mail consists of header fields (`headers' for short),
optionally followed by a message body.
- Each header spans one or more lines;
the second and any following lines of a header must be indented
by spaces or tabs.
- The message body and the preceding headers are separated by an empty line.
We want to extract every header whose first line matches a given regular expression
(ignoring case), ^(received|subject): in this example.
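
To see why a plain line-by-line filter is not enough for this task, here is a minimal sketch, assuming a POSIX shell and a grep that accepts -i and -E. The names first_lines_only and sample_mail, and the shortened mail itself, are made up for illustration. The filter finds the first line of each matching header but drops the indented continuation lines, which is why a whole header has to be collected before it is tested.

```sh
# Hypothetical helper: naive filter that looks at one line at a time.
first_lines_only() {
    grep -iE '^(received|subject):'
}

# Shortened, made-up sample mail (not the full raw input).
sample_mail() {
    printf '%s\n' \
        'Received: from a.example' \
        '    by b.example; 7 Feb 2004' \
        'To: someone' \
        'Subject: Hello!' \
        '' \
        'Body text'
}

# Prints the two matching first lines; the indented "by b.example" line is lost.
sample_mail | first_lines_only
```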

Raw Input
From sed-users@yahoogroups.com Sun May 9 14:52:11 2004
Return-Path:
Received: from n11.grp.scd.yahoo.com (n11.grp.scd.yahoo.com [66.218.66.66])
        by main.rtfiber.com.tw (8.11.6/8.11.6) with SMTP id i170Lq809415
        for ; Sat, 7 Feb 2004 08:21:52 +0800
Received: (qmail 74534 invoked from network); 7 Feb 2004 00:21:52 -0000
To: sed-users@yahoogroups.com
Received: from unknown (HELO n17.grp.scd.yahoo.com) (66.218.66.72)
        by mta2.grp.scd.yahoo.com with SMTP; 7 Feb 2004 00:21:51 -0000
Subject: Hello!
From: "john_vdv"

Welcome to the world of Regular Expressions!

Desired Output
Received: from n11.grp.scd.yahoo.com (n11.grp.scd.yahoo.com [66.218.66.66])
        by main.rtfiber.com.tw (8.11.6/8.11.6) with SMTP id i170Lq809415
        for ; Sat, 7 Feb 2004 08:21:52 +0800
Received: (qmail 74534 invoked from network); 7 Feb 2004 00:21:52 -0000
Received: from unknown (HELO n17.grp.scd.yahoo.com) (66.218.66.72)
        by mta2.grp.scd.yahoo.com with SMTP; 7 Feb 2004 00:21:51 -0000
Subject: Hello!

Script and Comments
Script1
[ 1] :loop
[ 2] N
[ 3] /\n[ \t]+[^\n]*$/b loop
[ 4] h
[ 5] s/\n[^\n]*$//
[ 6] /^(received|subject): /Ip
[ 7] x
[ 8] s/^.*\n//
[ 9] /^$/q
[10] b loop
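
Script1 can be tried out roughly as follows. This is a sketch assuming GNU sed: -n suppresses automatic printing, -E (or the older -r) enables the (a|b) alternation, and the I flag and the \t and \n escapes inside brackets are GNU extensions. The names extract_headers and sample_mail, and the abbreviated mail, are hypothetical and only serve the illustration.

```sh
# Hypothetical wrapper around Script1 (line numbers stripped); assumes GNU sed.
extract_headers() {
    sed -n -E '
        :loop
        N
        /\n[ \t]+[^\n]*$/b loop
        h
        s/\n[^\n]*$//
        /^(received|subject): /Ip
        x
        s/^.*\n//
        /^$/q
        b loop
    '
}

# Abbreviated, made-up sample mail with one folded header and a body.
sample_mail() {
    printf '%s\n' \
        'From sender Sun May 9 14:52:11 2004' \
        'Received: from a.example' \
        '    by b.example; 7 Feb 2004' \
        'To: someone' \
        'Subject: Hello!' \
        'From: "john_vdv"' \
        '' \
        'Body text'
}

# Prints the folded Received header (both lines) and the Subject header.
sample_mail | extract_headers
```

The script looks one line ahead to decide whether the current header is complete, so it relies on the empty line after the last header to stop; a mail that ends without that empty line would need separate handling.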