
Please feel free to make improvements to this document and the source code and check them into CVS. Thanks! -- David Booth

Introduction

Scribe.perl is a perl script for generating meeting minutes from an IRC, IM or other log file that follows some simple minuting conventions. It is easy to use and requires no installation. It was primarily designed for use in and around W3C, but can also be used in other environments. (For use in other environments, see the Input Formats section and the -template and -plain options.)

Quick Start Guide

Step 1: Invite RRSAgent and Zakim bot

Skip this step if you're not using W3C's IRC

IRC Command | Explanation
/invite rrsagent #ws-arch | #ws-arch should be your IRC channel
/invite zakim #ws-arch | #ws-arch should be your IRC channel
zakim, this will be ws_arch | ws_arch should be your meeting name as shown in the W3C teleconference calendar

Step 2: Start the meeting

IRC Command | Explanation
Scribe: David Booth | (Optional) David Booth is the name of the scribe as it should appear in the generated minutes. The ScribeNick will be used instead if Scribe is not specified. If there are multiple scribes: Issue this command and the ScribeNick: command whenever the scribe name changes.
ScribeNick: dbooth | dbooth is the IRC nickname of the scribe. The ScribeNick will be guessed if neither Scribe nor ScribeNick is specified. HINT: Do not change your IRC nickname to "scribe". If there are multiple scribes: Issue this command and the Scribe: command whenever the scribe nickname changes.
Meeting: WS Arch Teleconference | Record the meeting title.
Chair: Mike | Record who chaired.

Step 3: Take notes in IRC

IRC Command | Explanation
Topic: Debate on Feature X | Use "Topic: . . ." at the start of each agenda topic. Alternatively, you can use Zakim bot's agenda control, which is recognized by default. (See the -useZakimTopics option below.)
Mike: Feature X is great | Record what Mike said. NOTE: The speaker's name must not contain spaces!
... and easy to implement. | Mike's statement continues. (NOTE: Do not use blank space indentation alone to indicate continuation lines. That syntax was previously permitted, but was found to be too error-prone, particularly when text was pasted, so it is no longer supported.)
ACTION: Frank to order lunch | Record a new action.
ACTION: Mary to write spec [PENDING] | Record an old action and its status: DONE, PENDING, or DROPPED. The action status may also be on a line by itself immediately following the action item.
[PENDING] ACTION: Mary to write spec | Alternate syntax for recording action status.
RESOLUTION: Accept Frank's proposal | Indicate how an issue or topic was resolved.
s/Mary/Marie/ | Change the most recent occurrence of "Mary" to "Marie". The old string is currently treated as a literal string, not a regex. Alternate syntax: s|Mary|Marie|
s/Mary/Marie/g | Change all previous occurrences of "Mary" to "Marie".
s/Mary/Marie/G | Change all previous and future occurrences of "Mary" to "Marie" (within this document).
i/Time to vote/Topic: Vote on Feature Y | Insert a "Topic: Vote on Feature Y" line before the line containing the literal string "Time to vote" (not a regex). Alternate syntax: i|Time to vote|Topic: Vote on Feature Y

Step 4: Finish the meeting

IRC Command | Explanation
zakim, bye | Dismiss zakim bot, which will generate a list of attendees. Use "Present: ..." (described below) instead if you aren't using zakim bot.
rrsagent, make log public | (For public minutes and logs) Change the permissions on the IRC logs. Note that the permission changes are queued and it may be a minute or so before they take effect.
rrsagent, draft minutes | Tell RRSAgent to generate minutes from the log as written so far, inheriting access permissions from the log permissions. Note the location of the generated minutes. Skip this step if you are not using RRSAgent.
rrsagent, bye | Dismiss RRSAgent (if used).
(Download and edit the generated minutes) | If you're using RRSAgent, then just edit the generated minutes and you are done.

Continue to Step 5 if you wish to run scribe.perl manually or if you are not using RRSAgent.

Step 5: Generate minutes

Shell Command | Explanation
(Save a copy of the IRC log, such as http://www.w3.org/2002/04/05-arch-irc.txt) | (Hint: If RRSAgent wrote the minutes to http://...foo-minutes, then the IRC log will be at http://...foo-irc.txt.)
(Download scribe.perl) | No installation is needed, but you must have perl.
perl scribe.perl log.txt > minutes.html | (Generate minutes.)
(Review and make adjustments.) | If the result isn't good enough, either: 1. edit your copy of the log file and regenerate the HTML; or 2. manually edit the resulting HTML. Option 1 is best if you forgot to indicate who is scribe ("Scribe: ..."), or if you forgot to mark a topic start ("Topic: ...").
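
The hint about where RRSAgent keeps the IRC log can be sketched as a small Python helper. This is only an illustration and not part of scribe.perl: the function name irc_log_url is hypothetical, and it assumes the minutes URL ends in "-minutes" exactly as the hint describes.

```python
def irc_log_url(minutes_url):
    """Derive the RRSAgent text log URL from a minutes URL.

    Assumes the naming pattern quoted in the hint:
    ...foo-minutes  ->  ...foo-irc.txt
    """
    suffix = "-minutes"
    if not minutes_url.endswith(suffix):
        raise ValueError("not an RRSAgent minutes URL: " + minutes_url)
    # Swap the "-minutes" suffix for "-irc.txt".
    return minutes_url[: -len(suffix)] + "-irc.txt"
```

The resulting URL can then be saved locally and passed to scribe.perl as in the table above.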

Pros and Cons of Using Scribe.perl

Pros

Cons

Running scribe.perl

Scribe.perl reads standard input and writes to standard output:

perl scribe.perl [options] < log.txt > minutes.html

It can also be invoked by RRSAgent from IRC (provided you're using RRSAgent):

rrsagent, draft minutes

Options

Options described below are grouped in several categories:

There are three ways to specify options (from highest to lowest priority):

Input Style Options

These options are used to accommodate the different input syntaxes and scribing styles.

-dashTopics

Indicate that a line of dashes marks the next line as the start of a new topic, such as:

<Philippe> ---
<Philippe> Review of Action Items

instead of

Topic: Review of Action Items
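
The effect of -dashTopics can be sketched in Python as follows. This is an illustration of the behavior described above, not scribe.perl's actual code. The function name dash_topics is hypothetical, and the rule that a "dash line" is a message consisting only of two or more dashes is an assumption.

```python
import re

LINE = re.compile(r"^<(?P<nick>[^>]+)>\s*(?P<text>.*)$")
DASHES = re.compile(r"^-{2,}$")  # assumption: two or more dashes, nothing else


def dash_topics(lines):
    """Turn '<nick> ---' followed by '<nick> Title' into '<nick> Topic: Title'."""
    out = []
    pending = False  # True when the previous line was a dash line
    for line in lines:
        m = LINE.match(line)
        text = m.group("text").strip() if m else ""
        if m and DASHES.match(text):
            pending = True  # drop the dash line itself
            continue
        if pending and m and text:
            out.append("<%s> Topic: %s" % (m.group("nick"), text))
        else:
            out.append(line)
        pending = False
    return out
```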

-implicitContinuations

Indicate that the scribe used implicit continuation lines like this:

<dbooth> Mary: Now is the time
<dbooth> for all good men and women
<dbooth> to come to the aid of their party.

instead of this:

<dbooth> Mary: Now is the time
<dbooth> ... for all good men and women
<dbooth> ... to come to the aid of their party.

The implicit continuation style is not recommended, because it is ambiguous. For example, the "(Group agrees)" statement below will be incorrectly attributed to Mary, instead of being a scribe comment:

<dbooth> Mary: Now is the time
<dbooth> for all good men and women
<dbooth> to come to the aid of their party.
<dbooth> (Group agrees)

-useZakimTopics

[Default] Use Zakim bot to change topics, such as "zakim, take up next agendum". Specifically, treat Zakim statements like:

<Zakim> agendum 2. "UTF16 PR issue" taken up [from MSMscribe]

as equivalent to the command:

<scribe> Topic: UTF16 PR issue 
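
A rough Python sketch of this equivalence is shown here. It is not taken from scribe.perl: the regular expression is an assumption based only on the single Zakim line quoted above, and zakim_topic is a hypothetical name.

```python
import re

# Assumed shape of Zakim's announcement: agendum N. "title" taken up ...
AGENDUM = re.compile(r'^<Zakim>\s+agendum\s+\d+\.\s+"(?P<title>.*)"\s+taken up\b')


def zakim_topic(line):
    """Rewrite a Zakim 'agendum taken up' line as a Topic: command."""
    m = AGENDUM.match(line)
    if m:
        return "<scribe> Topic: " + m.group("title")
    return line  # every other line passes through unchanged
```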

-noUseZakimTopics

Turn off the -useZakimTopics option.

-inputFormat NameOfFormat

Force input to be treated as NameOfFormat, which must be one of the formats listed in Input Formats. This option is not normally needed, as the input format will normally be guessed.

Output Format Options

These options control the output format.

-tidy
Pipe the output through "tidy -c". This only works if you have tidy already installed. At present, it also causes tidy's console output to be jumbled in with the console messages from scribe.perl, which makes it hard to read scribe.perl's warnings.
-scribeOnly
Only include what the scribe wrote. Discard IRC statements made by others.
-draft
[DEFAULT] Include a "- DRAFT -" header in the formatted output, to remind you that the generated minutes still need manual editing.
-final
Omit the "- DRAFT -" header in the formatted output.
-embedDiagnostics
Embed scribe.perl's diagnostic output into the generated minutes. This is most useful when scribe.perl is run as part of an automated process that otherwise would not display the diagnostic output to the user.
-noEmbedDiagnostics
[DEFAULT] Write diagnostic output to the console (stderr). However, in the case of a fatal error, diagnostic output will still be written to stdout (instead of generated minutes), to avoid producing an empty and uninformative minutes file.
-trustRRSAgent

Take the action items from what RRSAgent says when it is dismissed ("<RRSAgent> I see 9 open action items..."), rather than from the original text that was written in IRC when the action was recorded ("<dbooth> ACTION: ..."). This option has pros and cons:

Pros:

Cons:

-noTrustRRSAgent
(DEFAULT) Turn off the -trustRRSAgent option. I.e., get the action items from wherever they were initially written. This permits action items to be corrected using "s/old/new/" commands instead of RRSAgent commands.

Template Options

Scribe.perl uses templates for generating the formatted output. There are a few built in, but you can specify your own if you wish.

-plain
Use the built-in plain (non-W3C) template.
-public
(DEFAULT) Use the built-in W3C template that is styled for public access.
-member
Use the built-in W3C template that is styled for W3C member-only access.
-team
Use the built-in W3C template that is styled for W3C team-only access.
-mit
Use the built-in W3C template that is styled for W3C MIT site meetings. After scribe.perl generates your HTML, you still have to manually insert the "Two Minutes" reports. Use the Two Minutes CGI Script to generate them.
-template templateFile.html
Use templateFile.html as the template for generating the minutes.
-sampleTemplate
Show me a sample template file. (This outputs the default template.)

Scribe Identification Options

-scribe Name
Specify the name of the scribe, as it should appear in the generated minutes. If Name appears in the input as an IRC nickname, it will also be taken as the scribe's nickname, as if specified by the -scribeNick option (below). This option may be used multiple times to indicate multiple scribes.
-scribeNick nickName
Indicate the IRC nickname of the scribe, which is used to figure out which lines were written by the scribe. This option may be used multiple times to indicate multiple scribeNicks.

Miscellaneous Options

-minutes http://example.org/2005/01-baking-club-minutes
Specify the URL where you will eventually publish the generated minutes. This does not cause the minutes to be published for you! Rather it allows each ACTION item to include a pointer to its original context in the minutes, such as:

ACTION: dbooth to bake 3 cakes
	[recorded in http://example.org/2005/01-baking-club-minutes#action01]

This option is not usually needed if RRSAgent is used, because the minutes URL will be inferred from lines such as:

<dbooth> rrsagent, draft minutes
<RRSAgent> I have made the request to generate
	http://www.w3.org/2005/01/07-swcg-minutes dbooth

-sampleInput
Show me some sample input, so I can learn how to use this program. Input is ignored if this option is used.
-sampleOutput
Show me some sample output. Input is ignored if this option is used.

Commands

Commands are interspersed with other minuted text in your IRC log, except that each command must be on a line by itself. Commands may be issued by anyone -- not only the scribe. Syntax is shown below by example, with italicized portions variable.

Editing Commands

These editing commands are usually the easiest way to correct simple mistakes or add clarifications. (Of course, they don't take effect until you run scribe.perl on your IRC log.)

s/old/new/
s/old/new
s|old|new|
s|old|new

Replace the most recent occurrence of old with new. Old is currently treated as a literal string (it will be escaped using quotemeta(...)) -- not a regular expression. These are processed in order, first to last. NOTE: If you correct an action item using s/old/new/, then you should not use the -trustRRSAgent option. Conversely, if you use "ACTION=..." to correct an action item using RRSAgent, then you should use the -trustRRSAgent option. For further explanation, see the -trustRRSAgent option.

s/old/new/g 
s|old|new|g

Replace globally from this point backward.

s/old/new/G 
s|old|new|G

Replace globally, both forward and backward.
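
The three substitution variants can be illustrated with a short Python sketch. It is not scribe.perl's implementation. It works on bare message text (no nickname prefix), drops the command lines from its output, and replaces the last occurrence within the matched line; all three choices are assumptions made for the example, as are the function names.

```python
def parse_sub(text):
    """Parse s/old/new/, s/old/new, s|old|new| and the g/G variants."""
    if len(text) < 2 or text[0] != "s" or text[1] not in "/|":
        return None
    parts = text[2:].split(text[1])
    if len(parts) < 2 or len(parts) > 3 or not parts[0]:
        return None
    flag = parts[2] if len(parts) == 3 else ""
    if flag not in ("", "g", "G"):
        return None
    return parts[0], parts[1], flag


def apply_substitutions(lines):
    out = []     # edited lines seen so far
    future = []  # (old, new) pairs from G commands, applied to later lines
    for line in lines:
        cmd = parse_sub(line)
        if cmd is None:
            for old, new in future:
                line = line.replace(old, new)  # literal, not a regex
            out.append(line)
            continue
        old, new, flag = cmd
        if flag == "":
            # Most recent occurrence: search backward for a matching line.
            for i in range(len(out) - 1, -1, -1):
                if old in out[i]:
                    head, _, tail = out[i].rpartition(old)
                    out[i] = head + new + tail
                    break
        else:
            out[:] = [prev.replace(old, new) for prev in out]
            if flag == "G":
                future.append((old, new))
    return out
```

Because str.replace and rpartition treat old as plain text, characters such as "." or "*" have no special meaning, which mirrors the literal-string behavior described above.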

i/locationString/lineToInsert
i/locationString/lineToInsert/
i|locationString|lineToInsert
i|locationString|lineToInsert|

Insert lineToInsert before the line containing locationString, which is currently treated as a literal string (it will be escaped using quotemeta(...)), not a regular expression. This is most helpful if you forgot to insert a "Topic: " command.

For example, the following use of the i// command:

<Arthur> Finished with issue LC71; on to LC82.
<Arthur> Frank: This is about syntax
<Arthur> ... Do we care about syntax?
<dbooth> i/Frank: This is about/Topic: Issue LC82

is converted to:

<Arthur> Finished with issue LC71; on to LC82.
<inserted> Topic: Issue LC82
<Arthur> Frank: This is about syntax
<Arthur> ... Do we care about syntax?
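
The same conversion can be sketched in Python. This is an illustration of the behavior shown above, not the script's own code. The function names are hypothetical, and searching backward for the most recent line containing locationString is an assumption.

```python
import re

MSG = re.compile(r"^<[^>]+>\s?(.*)$")  # "<nick> text" lines


def parse_insert(text):
    """Parse i/location/line, i/location/line/ and the | variants."""
    if len(text) < 2 or text[0] != "i" or text[1] not in "/|":
        return None
    delim = text[1]
    location, sep, rest = text[2:].partition(delim)
    if not sep or not location:
        return None
    if rest.endswith(delim):
        rest = rest[:-1]  # optional trailing delimiter
    return location, rest


def apply_inserts(lines):
    out = []
    for line in lines:
        m = MSG.match(line)
        cmd = parse_insert(m.group(1)) if m else None
        if cmd is None:
            out.append(line)
            continue
        location, new_text = cmd
        # Literal substring search, scanning backward from the command.
        for i in range(len(out) - 1, -1, -1):
            if location in out[i]:
                out.insert(i, "<inserted> " + new_text)
                break
    return out
```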

General Commands

Meeting: Baking Club

Use "Baking Club" as the title of the meeting minutes. This command should be used only once.

Chair: Jonathan

Jonathan was the meeting chair. This command should be used only once.

Scribe: Mary

Mary was the scribe. The Scribe command gives the full name of the scribe. If the name given in the Scribe command matches an IRC nickname, it will also be used as the ScribeNick. If you also use the ScribeNick command (below), you can use the Scribe command to give the scribe's full name rather than the IRC nickname. The Scribe command can be used more than once to indicate that there were multiple scribes during the meeting.

ScribeNick: MaryScr

MaryScr was the IRC nickname used by the scribe. Use this (and the "Scribe:" command above) if the scribe's IRC nickname was cryptic. The ScribeNick command can be used more than once, which indicates that multiple scribes were used during the meeting.

Agenda: http://www.example.com/agenda.html

Specify the agenda URL (optional). This command should be used only once.

Present: Jonathan, Mary, Barbara, Steve

Explicitly indicate who was present. If you're using Zakim bot you probably will not need to do this, because the script is usually able to figure out who was present.

Regrets: Nathan, Emma

Indicate who sent regrets (optional).

<frank> present+: Janine, Brian
<frank> present- Nathan
<frank> regrets+ Marja, Leonard
<frank> Present+

Add or remove names from the present/regrets lists. The colon is optional after + or -. With no names, the command adds or removes the speaker; for example, the last line above adds "frank".
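
The following Python sketch shows one way these list commands could be interpreted. It is not scribe.perl's code: update_lists is a hypothetical name, and the comma-separated name list and case-insensitive command matching are assumptions drawn from the examples above.

```python
import re

CMD = re.compile(
    r"^<(?P<nick>[^>]+)>\s*(?P<list>present|regrets)(?P<op>[+-]):?\s*(?P<names>.*)$",
    re.IGNORECASE,
)


def update_lists(lines):
    """Apply present+/present-/regrets+/regrets- commands in order."""
    lists = {"present": [], "regrets": []}
    for line in lines:
        m = CMD.match(line)
        if not m:
            continue
        names = [n.strip() for n in m.group("names").split(",") if n.strip()]
        if not names:
            names = [m.group("nick")]  # no names given: use the speaker
        target = lists[m.group("list").lower()]
        for name in names:
            if m.group("op") == "+":
                if name not in target:
                    target.append(name)
            elif name in target:
                target.remove(name)
    return lists
```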

Date: 05 Dec 2002

Specify the meeting date. Not usually needed, because the date is set automatically from the IRC log name or the current date.

ACTION: Frank, Mary and Kate to propose solution for issue 42
ACTION: David to clean the kitchen [PENDING]

Give an action item. Action recipients should be listed before the word "to". Action status may be given at the end or on the next line: [DONE], [PENDING], or [DROPPED]. Several synonyms of these status keywords are also recognized.
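
As an illustration, the pieces of an ACTION line can be pulled apart with a regular expression. The Python sketch below is not the parser used by scribe.perl. It handles only the three status keywords named above (not their synonyms) and only a status at the end of the same line; parse_action and the returned dictionary keys are hypothetical.

```python
import re

ACTION = re.compile(
    r"^ACTION:\s*(?P<who>.+?)\s+to\s+(?P<task>.+?)"
    r"(?:\s*\[(?P<status>DONE|PENDING|DROPPED)\])?\s*$"
)


def parse_action(text):
    """Split an ACTION line into recipients, task and optional status."""
    m = ACTION.match(text)
    if not m:
        return None
    # Recipients come before the word "to", separated by commas or "and".
    who = [w for w in re.split(r"\s*,\s*|\s+and\s+", m.group("who")) if w]
    return {"who": who, "task": m.group("task"), "status": m.group("status")}
```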

RESOLUTION: Issue 42 closed as duplicate of issue 21

Indicate a decision that the group has made.

Log: http://www.w3.org/2002/11/07-ws-arch-irc

Explicitly indicate the IRC log location. Not usually needed, because the script normally infers it from statements like:

<dbooth> rrsagent, where am i?
<RRSAgent> See http://www.w3.org/2002/11/07-ws-arch-irc#T13-59-36

ScribeOptions: -tidy -dashTopics -embedDiagnostics

Specify options inline, as if they had been written on the command line like:

perl scribe.perl -tidy -dashTopics -embedDiagnostics

NamedAnchorHere: foo

Cause a named anchor foo to be generated at this point in the minutes, such as:

This is mostly for use internally by scribe.perl.

Manually Editing the Log

In addition to the "realtime" editing commands above, you may manually edit your IRC log with a text editor (by mimicking the log format) before running scribe.perl. This is most useful for commands that have a broad effect on the generated HTML. For most log formats (except HTML), you can leave off the timestamps on lines that you insert.

Here is an example snippet of an IRC log in which the commands "ScribeNick: tw" and "Topic: Fire Alarms" have been inserted using a text editor.

20:11:21 <tw> Scribe: TedWilliams
<tw> ScribeNick: tw
20:11:34 <tw> Topic: Printers
20:11:49 <tw> alan: Printers are working now
<tw> Topic: Fire Alarms
20:12:27 <tw> ralph: Fire marshall assured us that the system is working.

(Note that the IRC name <tw> is required, though for most commands it does not matter what IRC name is specified.)

Input Formats

Several log formats are recognized, and are described below:

Scribe.perl guesses which format you have used by seeing which format best matches the input. If necessary, you can specify the input format explicitly using the -inputFormat option and specifying the name of the format such as "-inputFormat RRSAgent_Text_Format". The names of the formats are actually the perl function names used in the code.
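
The idea of picking the best-matching format can be sketched in Python. This is a simplified illustration, not the scoring that scribe.perl actually performs. The three regular expressions are assumptions derived from the example logs in this section, and guess_format is a hypothetical name.

```python
import re

# Assumed line shapes, based on the examples shown for each format.
PATTERNS = {
    "RRSAgent_Text_Format": re.compile(r"^\d\d:\d\d:\d\d <[^>]+> "),
    "Mirc_Timestamped_Log_Format": re.compile(r"^\[\d\d:\d\d\] "),
    "Normalized_Format": re.compile(r"^<[^>]+> "),
}


def guess_format(lines):
    """Return the format whose pattern matches the largest share of lines."""
    lines = [line for line in lines if line.strip()]
    if not lines:
        return None
    scores = {
        name: sum(1 for line in lines if pattern.match(line)) / len(lines)
        for name, pattern in PATTERNS.items()
    }
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else None
```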

If you have a log format that is not recognized:

  1. Try doing global search/replace in a text editor, to convert it into a recognized format, such as Normalized_Format; and
  2. Please email an example to dbooth@w3.org so that I can consider adding support for it.

RRSAgent_Text_Format (RECOMMENDED)

The plain text format (*.txt) produced by RRSAgent. Hint: add ".txt" to the RRSAgent log URL to get the text version, such as: http://www.w3.org/2002/11/07-ws-arch-irc.txt . The timestamps are ignored (except in guessing the input format), so you can safely edit the log and add lines without having to fake timestamps. Example:

20:41:27 <dbooth> Mike: Feature X would benefit users.
20:41:37 <dbooth> ... and implementation would be easy.
20:41:47 <ericn> I agree.

RRSAgent_HTML_Format

The HTML format (*.html) produced by RRSAgent. This is the raw HTML code. Example:

<dt id="T20-41-27">20:41:27 [dbooth]</dt>
	<dd>Mike: Feature X would benefit users. </dd>
<dt id="T20-41-37">20:41:37 [dbooth]</dt>
	<dd>... and implementation would be easy. </dd>
<dt id="T20-41-47">20:41:47 [ericn]</dt>
	<dd>I agree. </dd>

RRSAgent_Visible_HTML_Text_Paste_Format

This is for the format that is visible in the browser when RRSAgent's HTML is displayed. I.e., when you display the *.html log in a browser, and then copy and paste the text from the browser window, only the displayed text is copied -- not the raw HTML tags. Example:

20:41:27 [dbooth]
     Mike: Feature X would benefit users.
20:41:37 [dbooth]
     ... and implementation would be easy.
20:41:47 [ericn]
     I agree.

Mirc_Timestamped_Log_Format

Timestamped log format produced by mIRC. Example:

[19:35] <Zakim> Steven should now be muted
[19:36] <ph> http://lists.w3.org/Archives/Member/w3c-html-cg/2004JanMar/0038.html
[19:36] * Zakim hears Steven's hand up
[19:36] * Zakim sees Steven on the speaker queue

Mirc_Text_Format

The format produced by mIRC when you do Buffer-->Save As. Example:

<dbooth> Mike: Feature X would benefit users.
<dbooth> ... and implementation would be easy.
<dbooth> This is pretending to be a long
 line that mIRC breaks in order to display,
 but scribe.perl will re-join into a single line.
<ericn> I agree.

Normalized_Format

This is the format used internally by scribe.perl. Scribe.perl converts all other formats to this format before processing. Superficially it looks similar to the Mirc_Text_Format, except that Mirc_Text_Format may contain broken lines that still need to be rejoined back into single, long lines. Example:

<dbooth> Mike: Feature X would benefit users.
<dbooth> ... and implementation would be easy.
<dbooth> This is pretending to be a long line that mIRC breaks in order to display, but scribe.perl will re-join into a single line.
<ericn> I agree.
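
The rejoining of broken mIRC lines can be sketched in Python. This is an illustration rather than scribe.perl's own logic, and it assumes that a continuation line is any line beginning with a space, as in the Mirc_Text_Format example. The name rejoin is hypothetical.

```python
def rejoin(lines):
    """Merge lines that start with a space into the preceding line."""
    out = []
    for line in lines:
        if line.startswith(" ") and out:
            # Continuation: append to the previous line with one space.
            out[-1] = out[-1].rstrip() + " " + line.strip()
        else:
            out.append(line)
    return out
```

Applied to the Mirc_Text_Format example, this yields the four lines of the Normalized_Format example.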

XChat_Timestamped_Log_Format

Timestamped log format produced by X-Chat. Example:

**** BEGIN LOGGING AT Mon Feb 14 11:02:06 2005

Feb 14 11:02:06 -->   You are now talking on &arch
Feb 14 11:02:06 ---  Topic for &arch is W3C Architecture Mardi Gras Meeting
Feb 14 11:02:06 ---    Topic for &arch set by plh at Tue Feb  8 13:43:31 2005
Feb 14 11:02:11 <Zakim>  ok, plh; the call is being made
Feb 14 11:02:18 <Zakim> WS_Team()11:00AM has now started
Feb 14 11:02:19 <Zakim>    +Plh
Feb 14 11:02:23 <larryk>   This is an on-the-record comment
Feb 14 11:02:26 *    Yves This is an off-the-record comment
**** ENDING LOGGING AT Mon Feb 14 10:22:09 2005

Irssi_ISO8601_Log_Text_Format

The format produced by Irssi logging with ISO8601 timestamp prefixes. Example:

2003-12-18T15:27:21-0500 <dbooth> Mike: Feature X would benefit users.
2003-12-18T15:27:36-0500 <dbooth> ... and implementation would be easy.
2003-12-18T15:27:36-0500 <ericn> I agree.

The date, seconds and timezone are optional, so the following is also recognized:

15:27 <dbooth> Mike: Feature X would benefit users.
15:27 <dbooth> ... and implementation would be easy.
15:27 <ericn> I agree.
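
A regular expression with optional date, seconds and timezone parts might look like the Python sketch below. It is an assumption based on the two examples above, not the pattern used inside scribe.perl, and normalize_irssi is a hypothetical name.

```python
import re

IRSSI = re.compile(
    r"^(?:\d{4}-\d\d-\d\dT)?"   # optional date, e.g. 2003-12-18T
    r"\d\d:\d\d(?::\d\d)?"      # hours and minutes, optional seconds
    r"(?:[+-]\d{4})?"           # optional timezone, e.g. -0500
    r"\s+<(?P<nick>[^>]+)>\s?(?P<text>.*)$"
)


def normalize_irssi(line):
    """Strip the timestamp prefix, leaving '<nick> text'."""
    m = IRSSI.match(line)
    if not m:
        return None
    return "<%s> %s" % (m.group("nick"), m.group("text"))
```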

Yahoo_IM_Format

The format produced when you save a Yahoo IM session. Example:

dbooth: Mike: Feature X would benefit users.
dbooth: ... and implementation would be easy.
ericn: I agree.

Plain_Text_Format

Line-oriented plain text format. This format is intended for occasions when the scribe doesn't have access to IRC, and simply takes notes in a text editor. Example:

Mike: Feature X would benefit users.
... and implementation would be easy.
Eric agrees.

Bert_IRSSI_Format

The style of IRSSI logs as used by Bert Bos, characterized by a date and time at the start and a vertical bar separating the nickname and the message. This is almost identical to the elho theme for IRSSI, which should work here as well. Here is an example:

--- Log opened Thu Mar 19 12:57:23 2015
12:57       «Users» | 23 nicks (0 ops, 0 halfops, 0 voices, 23 normal
12:57        «Mode» | Channel &global was created on Fri Apr 11 21:35:25 2014
13:08             * | koalie veronica :D
13:08           --> | naomi_ (naomi@team.cloak) has joined &global
13:10         Zakim | +Plh
13:11        scribe | Angel: Short meeting.
13:13        «Quit» | naomi (naomi@team.cloak) has signed off (Ping timeout: 180 seconds)
13:14      veronica | q?
13:15             * | Zakim sees no one on the speaker queue
13:15        scribe | Veronica: Please everyone update the FTMS records.
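
A Python sketch of how such lines could be mapped to "<nick> text" form is shown here. It is not scribe.perl's parser. Treating "*", "-->" and «...» entries as non-message lines to be dropped is an assumption based only on the example above, and normalize_bert is a hypothetical name.

```python
import re

# Time, padded nickname column, vertical bar, then the message.
BERT = re.compile(r"^\d\d:\d\d\s+(?P<nick>\S+)\s\|\s(?P<text>.*)$")


def normalize_bert(line):
    """Return '<nick> text' for message lines, or None for other lines."""
    m = BERT.match(line)
    if not m:
        return None
    nick = m.group("nick")
    # Assumption: these markers denote status or action lines, not speech.
    if nick in ("*", "-->") or nick.startswith("«"):
        return None
    return "<%s> %s" % (nick, m.group("text"))
```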

Environment Variables

The SCRIBEOPTIONS environment variable holds the default options that one wants to use. It is not required. Example for a Bourne-style shell:

export SCRIBEOPTIONS="-trustRRSAgent"

Author: David Booth
This software is available for use under the W3C Software License.