How can I automate repetitive find’n’replace operations?

I’m the webmaster (and a trustee) of the West Midland Bird Club, a registered charity.


Every month, I get sent a series of text files, with lists of bird sightings at each of our reserves, and some other locations. They usually comprise around thirty entries like these:

  • 6th: 1 Dunlin, 1 Oystercatcher, 2 Little Ringed Plovers, 1 Common Sandpiper, pair Shelduck, pair Greylag Geese flew over, 1 Cuckoo, pair Kingfisher, 2 Lesser Whitethroat at north end of Reserve.
  • 5th: 3 Oystercatchers, 1 Ringed Plover.

and I need to turn them into HTML markup like this:

<li class="hentry" id="D2011-05-06"><span class="entry-content"><abbr class="updated entry-title" title="2011-05-06">6th</abbr>: 1 <span class="biota bird"><b class="vernacular">Dunlin</b></span>, 1 <span class="biota bird"><b class="vernacular">Oystercatcher</b></span>, 2 <span class="biota bird"><b class="vernacular">Little Ringed Plovers</b></span>, 1 <span class="biota bird"><b class="vernacular">Common Sandpiper</b></span>, pair <span class="biota bird"><b class="vernacular">Shelduck</b></span>, pair <span class="biota bird"><b class="vernacular">Greylag Geese</b></span> flew over, 1 <span class="biota bird"><b class="vernacular">Cuckoo</b></span>, pair <span class="biota bird"><b class="vernacular">Kingfisher</b></span>, 2 <span class="biota bird"><b class="vernacular">Lesser Whitethroat</b></span> at north end of Reserve.</span></li>

<li class="hentry" id="D2011-05-05"><span class="entry-content"><abbr class="updated entry-title" title="2011-05-05">5th</abbr>: 3 <span class="biota bird"><b class="vernacular">Oystercatchers</b></span>, 1 <span class="biota bird"><b class="vernacular">Ringed Plover</b></span>.</span></li>

to make pages like this one: westmidlandbirdclub.com/belvide/latest.
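The date wrapper in that markup is mechanical as well. As a rough sketch in Python, assuming each entry starts with a day number and ordinal suffix followed by a colon, and that the year and month are supplied separately (the function name `wrap_entry` is made up for illustration):

```python
import re
from datetime import date

# Matches e.g. "  • 6th: 1 Dunlin, ..." (the bullet is optional).
ENTRY = re.compile(r"^\s*(?:•\s*)?(\d{1,2})(st|nd|rd|th):\s*(.*)$")

def wrap_entry(line, year, month):
    """Wrap one sighting line in the list-item markup shown above."""
    m = ENTRY.match(line)
    if m is None:
        raise ValueError(f"not a sighting entry: {line!r}")
    day, suffix, rest = m.groups()
    iso = date(year, month, int(day)).isoformat()  # e.g. "2011-05-06"
    return (
        f'<li class="hentry" id="D{iso}"><span class="entry-content">'
        f'<abbr class="updated entry-title" title="{iso}">{day}{suffix}</abbr>: '
        f'{rest}</span></li>'
    )
```

The species markup would still need to be applied to the text after the colon as a separate step.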

That involves a series of find’n’replace operations, in sequence, like:

  • Find Oystercatcher and replace with <span class="biota bird"><b class="vernacular">Oystercatcher</b></span>
  • Find Ringed Plover and replace with <span class="biota bird"><b class="vernacular">Ringed Plover</b></span>
  • Find Little <span class="biota bird"><b class="vernacular">Ringed Plover</b></span> and replace with <span class="biota bird"><b class="vernacular">Little Ringed Plover</b></span>
  • Find </b></span>s and replace with s</b></span>

…and so on. With well over 100 species in a typical series of reports, that’s a lot of faffing about. And it has to be done every month. It’s a right pain in the Wheatear.
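For illustration only, the four steps listed above can be sketched in Python as an ordered list of plain-text replacements. The names `RULES` and `apply_rules` are invented for this sketch; a real list would hold every species.

```python
OPEN = '<span class="biota bird"><b class="vernacular">'
CLOSE = '</b></span>'

# Ordered (find, replace) pairs; the order matters, as in the steps above.
RULES = [
    ("Oystercatcher", OPEN + "Oystercatcher" + CLOSE),
    ("Ringed Plover", OPEN + "Ringed Plover" + CLOSE),
    # Repair the half-wrapped name left behind by the previous rule.
    ("Little " + OPEN + "Ringed Plover" + CLOSE,
     OPEN + "Little Ringed Plover" + CLOSE),
    # Move a plural "s" inside the markup.
    (CLOSE + "s", "s" + CLOSE),
]

def apply_rules(text, rules=RULES):
    """Apply each literal find-and-replace in turn and return the result."""
    for find, replace in rules:
        text = text.replace(find, replace)
    return text

print(apply_rules("5th: 3 Oystercatchers, 1 Ringed Plover."))
```

Because each rule sees the output of the one before it, adding a species means adding a pair in the right position, not rewriting the function.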

I need to find a way to automate this (under Windows XP, preferably GUI-based), working from a saved list of find’n’replace terms, and would appreciate suggestions. Is there a text editor with a facility for sequencing such operations? I could learn to write code to do it, but that’s a heavy up-front investment. Or would someone like to volunteer to help me put the code together?

Update: I’ve found a solution in ReplaceText which, though it’s sadly no longer supported and apparently doesn’t work under Windows 7, does just what I need.

Image of Oystercatcher in flight at Els Ness, Sanday, Orkney, by lukaaash.


About Andy Mabbett

Enjoying my new freelance career, helping organisations to understand on-line communities, open content, and related issues.
This entry was posted in annoyances, microformats, nature.

9 Responses to How can I automate repetitive find’n’replace operations?

  1. Dave Briggs says:

    Andy – don’t know how to do it, but my understanding is that this sort of text processing is what Perl was made for. Might be a new skill to learn!

  2. Ssaul Cozens says:

    Andy,

    A couple of things you could look at. First, this Wikipedia page might be a useful reference: http://en.wikipedia.org/wiki/Comparison_of_text_editors

    You might want to check for features like S&R history, which can save a lot of time. Or, if you are willing to invest a bit more time, try to understand some basic regular expressions. These will help with the most formulaic operations. You’ll of course have to identify a text editor that does regexp-based S&Rs across multiple files.

    Having said that, this is exactly the kind of stuff that the command line really works well on, particularly on *nix-based OSes. Basically, you could write a single line that does one of your S&Rs across all files in a folder. Pop this line into a text file, then you can run it by typing its filename. You can then add new S&Rs to this executable file to fit your needs.

    Take a look at this article by Gina Trapani http://lifehacker.com/354546/find-and-replace-text-with-fart – Yes the tool is called FART!

    I think Windows still considers any file with a .bat extension to be a batch script, so once you have one command working, pop it in a text file called replacebird.bat and run that.

    Hope this makes some kind of sense.
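As a rough illustration of the scripted approach described in this comment, here is a Python sketch that reads a saved list of terms and applies it to every text file in a folder. It assumes the list is a tab-separated file with the search text, a tab, then the replacement on each line; that file format and the names `load_rules` and `convert_folder` are assumptions made for the example.

```python
import sys
from pathlib import Path

def load_rules(path):
    """Read ordered (find, replace) pairs, one tab-separated pair per line."""
    rules = []
    for line in Path(path).read_text(encoding="utf-8").splitlines():
        if "\t" in line:  # skip blank or malformed lines
            find, replace = line.split("\t", 1)
            rules.append((find, replace))
    return rules

def convert_folder(folder, rules, pattern="*.txt"):
    """Write a converted .html file next to each matching text file."""
    written = []
    for src in sorted(Path(folder).glob(pattern)):
        text = src.read_text(encoding="utf-8")
        for find, replace in rules:
            text = text.replace(find, replace)
        dest = src.with_suffix(".html")
        dest.write_text(text, encoding="utf-8")
        written.append(dest)
    return written

# Hypothetical usage: python replacebird.py rules.tsv reports
if __name__ == "__main__" and len(sys.argv) == 3:
    for path in convert_folder(sys.argv[2], load_rules(sys.argv[1])):
        print("wrote", path)
```

The original text files are left untouched, so the same list can be rerun each month after new species are added to it.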

  3. Guy Chapman says:

    Notepad++ is free and does regex search and replace: http://notepad-plus-plus.org/ – handy tool for any HTML author to have around. Works on most flavours of Windows.
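To show what a regex-based replacement might look like for this task, here is a Python sketch that wraps every listed species in a single pass. The species list is a short sample and the function names are invented for the example. It illustrates the regular-expression idea and is not a Notepad++ recipe.

```python
import re

SPECIES = ["Oystercatcher", "Ringed Plover", "Little Ringed Plover", "Dunlin"]

def build_pattern(species):
    # Longest names first, so the longer name wins when two names
    # could match at the same starting position.
    names = sorted(species, key=len, reverse=True)
    alternation = "|".join(re.escape(n) for n in names)
    # The optional trailing "s" keeps simple plurals inside the markup.
    return re.compile(r"\b(?:" + alternation + r")s?\b")

def mark_up(text, pattern):
    """Wrap each matched species name in the vernacular-name markup."""
    return pattern.sub(
        lambda m: '<span class="biota bird"><b class="vernacular">'
        + m.group(0) + "</b></span>",
        text,
    )
```

A single pass never rescans its own output, so "Ringed Plover" inside "Little Ringed Plover" is not wrapped a second time and no repair step is needed.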

  4. Andy Mabbett says:

    Thanks, folks. There are some interesting leads for me to follow up, there.

    I’ve added a couple of lines, clarifying that I need to work from a saved list of find’n’replace terms, month after month.

  5. Tom Morris says:

    Yep, Perl will do it (or Ruby or any programming language) – Perl and Ruby are good for the job.

  6. Mike Cummins says:

    Pick a language that you feel would be useful and I will send you an annotated program in that language to do the job you require.

    I would recommend Perl, PHP or Python myself due to your web interest.

  7. Andy Mabbett says:

    Thanks to everyone for suggestions, and especially Mike who was all fired up to code something, but, as you can see in the update at the end of my post, I’ve found just what I need.

  8. Charles says:

    PHP is what I’d use. Or AppleScript… though of course that’s not available to you. But it’s a fairly common regex (regular expressions) problem. PHP is exceptionally good at it. If you tied it into pumping the results into a database (MySQL is free) then you’d both have a record week by week, *and* you could automate the output, *and* if you changed how you want the results to look (for a redesign, say) all you change is the PHP code, while the inputs (bird details) remain the same.
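The idea of storing the plain entries and generating the markup on output can be sketched briefly. This example swaps in Python and SQLite for the PHP and MySQL mentioned in the comment; the table layout, the function names, and the site key "belvide" are all assumptions made for illustration.

```python
import sqlite3

def open_db(path=":memory:"):
    """Open (or create) a small database of plain-text sightings."""
    db = sqlite3.connect(path)
    db.execute(
        "CREATE TABLE IF NOT EXISTS sightings ("
        "site TEXT NOT NULL, day TEXT NOT NULL, entry TEXT NOT NULL, "
        "PRIMARY KEY (site, day))"
    )
    return db

def add_sighting(db, site, day, entry):
    # Store the plain text only; markup is applied when rendering.
    db.execute(
        "INSERT OR REPLACE INTO sightings VALUES (?, ?, ?)", (site, day, entry)
    )
    db.commit()

def render(db, site):
    """Return one list item per day for a site, newest first."""
    rows = db.execute(
        "SELECT day, entry FROM sightings WHERE site = ? ORDER BY day DESC",
        (site,),
    )
    return "\n".join(
        f'<li class="hentry" id="D{day}">{entry}</li>' for day, entry in rows
    )
```

With the inputs kept as plain text, a redesign would only mean changing `render`; the stored sightings stay as they are.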
