Given a block of text like the following, I want to strip out some of the events:
BEGIN:VEVENT
GEO:48.1667\;-123.1167
TRANSP:TRANSPARENT
LOCATION:Dungeness\, Washington
DTSTART:20080608T131700Z
UID:D56BE4D5-F9A3-4026-B99A-C2D979639220
SUMMARY:High Tide 1.83 meters
DTSTAMP:20080617T002129Z
END:VEVENT
BEGIN:VEVENT
GEO:48.1667\;-123.1167
TRANSP:TRANSPARENT
LOCATION:Dungeness\, Washington
DTSTART:20081012T013000Z
UID:373CB6BD-F894-4826-8D04-6683AADFB4C4
SUMMARY:Sunset
DTSTAMP:20080617T002129Z
END:VEVENT
BEGIN:VEVENT
GEO:48.1667\;-123.1167
TRANSP:TRANSPARENT
LOCATION:Dungeness\, Washington
DTSTART:20080125T035500Z
UID:1EAC4C71-23B1-456D-9302-1436E407B84E
SUMMARY:Moonrise
DTSTAMP:20080617T002129Z
END:VEVENT
BEGIN:VEVENT
GEO:48.1667\;-123.1167
TRANSP:TRANSPARENT
LOCATION:Dungeness\, Washington
DTSTART:20080920T081500Z
UID:CF306FB5-480D-433D-9E2C-34569BE0A654
SUMMARY:Low Tide -0.33 meters
DTSTAMP:20080617T002129Z
END:VEVENT
The goal: remove all the SUMMARY:Moonrise|Sunrise|High Tide VEVENT mentions, leaving just the low tides.
This (?s)BEGIN:VEVENT.*?END:VEVENT
will find just the VEVENT items.
This (?s)BEGIN:VEVENT.*?SUMMARY:Low Tide.*?END:VEVENT
gets too much: it grabs everything from the first instance of BEGIN:VEVENT
to the END:VEVENT
after the Low Tide, no matter how many other events get collected. Looks like I need a look-behind: find the END:VEVENT
and the Low Tide that came just before it, and then everything back to the BEGIN:VEVENT.
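For what it's worth, a single-pass pattern can be kept inside one event without a look-behind. The sketch below is not what I ended up using; it swaps in a negative lookahead (sometimes called a "tempered dot") so the match is never allowed to step across an END:VEVENT. The three-event sample string is a made-up abbreviation of the data above, and the variable names are only for illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Abbreviated sample data (an assumption for this sketch): three events,
# only the middle one is a low tide.
my $text = join "\n",
    'BEGIN:VEVENT', 'SUMMARY:Sunset',                'END:VEVENT',
    'BEGIN:VEVENT', 'SUMMARY:Low Tide -0.33 meters', 'END:VEVENT',
    'BEGIN:VEVENT', 'SUMMARY:Moonrise',              'END:VEVENT',
    '';

# (?:(?!END:VEVENT).) consumes one character at a time, but only when
# that character is not the start of "END:VEVENT". A match that begins
# at a BEGIN:VEVENT therefore cannot run past the end of its own event.
my @low = $text =~ /
    (
        BEGIN:VEVENT
        (?:(?!END:VEVENT).)*?   # stay inside this event
        SUMMARY:Low\ Tide
        (?:(?!END:VEVENT).)*    # rest of the event body
        END:VEVENT
    )
/gsx;

print "$_\n" for @low;
```

The plain `.*?` version fails because laziness only controls where a match ends, not where it starts: the engine still begins at the leftmost BEGIN:VEVENT and stretches until it finds a Low Tide. The lookahead removes that option.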
And ideally, I just pull out the minus tides, especially if I have to go that far (anything that includes a ferry ride needs to be carefully considered).
After a lot of back and forth with a Perl guru (I really get tripped up by this stuff), it was clear that I was trying to do too much in one pass. It is better to pull out the events first, then extract the ones we want, all with the default line-by-line input delimiter turned off so the whole file arrives as one string. So what I ended up with appears below the fold. I had the regex right (those have always been my bête noire) but I had no idea what to do with what I was getting.
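Before the script itself, a quick aside on why the delimiter matters. This is a minimal sketch, assuming a one-event sample string and an in-memory filehandle (both are stand-ins for the real calendar file). It reads the same data twice, once line by line and once with `$/` undefined, and counts how often the event pattern can match.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Stand-in data for this sketch: a single three-line event.
my $data = "BEGIN:VEVENT\nSUMMARY:Sunset\nEND:VEVENT\n";

# Default behaviour: $/ is "\n", so each read returns one line.
open(my $by_line, '<', \$data) or die "cannot open in-memory file: $!";
my @lines = <$by_line>;
close($by_line);

# Slurp mode: with $/ undefined (localized to this block), a single
# read returns the whole file as one string.
open(my $whole, '<', \$data) or die "cannot open in-memory file: $!";
my $text = do { local $/; <$whole> };
close($whole);

# No single line contains both BEGIN:VEVENT and END:VEVENT, so the
# multi-line pattern can only succeed against the slurped string.
my $line_hits  = grep { /BEGIN:VEVENT.*?END:VEVENT/s } @lines;
my @slurp_hits = $text =~ /(BEGIN:VEVENT.*?END:VEVENT)/gs;

printf "%d lines read, %d line-at-a-time matches, %d slurped matches\n",
    scalar(@lines), $line_hits, scalar(@slurp_hits);
```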
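#!/usr/bin/perl
use strict;
use warnings;

# Slurp mode: with the input record separator undefined, each <> read
# returns a whole file instead of a single line.
local $/ = undef;
my @ical = ();
while (<>) {
    # Everything from BEGIN:VCALENDAR through the METHOD:PUBLISH line.
    my @header = /(BEGIN:VCALENDAR.*?METHOD:PUBLISH\r?\n)/gs;
    # One string per event, BEGIN:VEVENT through END:VEVENT.
    my @events = /(BEGIN:VEVENT.*?END:VEVENT\r?\n)/gs;
    my $footer = "END:VCALENDAR\n";
    push(@ical, @header);
    # Keep only the low tides with a negative height (the minus tides).
    push(@ical, grep { /SUMMARY:Low Tide.*-\d/ } @events);
    push(@ical, $footer);
}
# Now @ical holds the calendar header, then one string per minus-tide
# event (its BEGIN:VEVENT through END:VEVENT lines), then the footer.
print for (@ical);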
And now I have a reliable calendar of tide events that I can share.