How can I automate repetitive find’n’replace operations?

I’m the webmaster (and a trustee) of the West Midland Bird Club, a registered charity.


Every month, I get sent a series of text files, with lists of bird sightings at each of our reserves, and some other locations. They usually comprise around thirty entries like these:

  • 6th: 1 Dunlin, 1 Oystercatcher, 2 Little Ringed Plovers, 1 Common Sandpiper, pair Shelduck, pair Greylag Geese flew over, 1 Cuckoo, pair Kingfisher, 2 Lesser Whitethroat at north end of Reserve.
  • 5th: 3 Oystercatchers, 1 Ringed Plover.

and I need to turn them into HTML markup like this:

<li class="hentry" id="D2011-05-06"><span class="entry-content"><abbr class="updated entry-title" title="2011-05-06">6th</abbr>: 1 <span class="biota bird"><b class="vernacular">Dunlin</b></span>, 1 <span class="biota bird"><b class="vernacular">Oystercatcher</b></span>, 2 <span class="biota bird"><b class="vernacular">Little Ringed Plovers</b></span>, 1 <span class="biota bird"><b class="vernacular">Common Sandpiper</b></span>, pair <span class="biota bird"><b class="vernacular">Shelduck</b></span>, pair <span class="biota bird"><b class="vernacular">Greylag Geese</b></span> flew over, 1 <span class="biota bird"><b class="vernacular">Cuckoo</b></span>, pair <span class="biota bird"><b class="vernacular">Kingfisher</b></span>, 2 <span class="biota bird"><b class="vernacular">Lesser Whitethroat</b></span> at north end of Reserve.</span></li>

<li class="hentry" id="D2011-05-05"><span class="entry-content"><abbr class="updated entry-title" title="2011-05-05">5th</abbr>: 3 <span class="biota bird"><b class="vernacular">Oystercatchers</b></span>, 1 <span class="biota bird"><b class="vernacular">Ringed Plover</b></span>.</span></li>

to make pages like this one: westmidlandbirdclub.com/belvide/latest.
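The date part of that transformation (the `<li>`, the `id` and the `<abbr>`) could be sketched in Python along these lines. This is only an illustration, not how the pages are actually built: the function name `wrap_entry` is made up, and the year and month are assumed to be supplied by hand, since the text files only give the day.

```python
import re
from datetime import date

# Optional bullet, a day number with its ordinal suffix, a colon, then the rest.
ENTRY_RE = re.compile(r"^\s*(?:•\s*)?(\d{1,2})(st|nd|rd|th):\s*(.*)$")

def wrap_entry(line, year, month):
    """Wrap one sightings line in the hentry <li> markup.

    Returns None if the line does not start with a day such as '6th:'.
    year and month are assumptions passed in by the caller.
    """
    match = ENTRY_RE.match(line)
    if match is None:
        return None
    day, suffix, rest = match.groups()
    iso = date(year, month, int(day)).isoformat()  # e.g. 2011-05-06
    return (
        f'<li class="hentry" id="D{iso}"><span class="entry-content">'
        f'<abbr class="updated entry-title" title="{iso}">{day}{suffix}</abbr>: '
        f'{rest}</span></li>'
    )

print(wrap_entry("  • 5th: 3 Oystercatchers, 1 Ringed Plover.", 2011, 5))
```

The species names are left untouched here; marking them up is a separate step.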

That involves a series of find’n’replace operations, in sequence, like:

  • Find Oystercatcher and replace with <span class="biota bird"><b class="vernacular">Oystercatcher</b></span>
  • Find Ringed Plover and replace with <span class="biota bird"><b class="vernacular">Ringed Plover</b></span>
  • Find Little <span class="biota bird"><b class="vernacular">Ringed Plover</b></span> and replace with <span class="biota bird"><b class="vernacular">Little Ringed Plover</b></span>
  • Find </b></span>s and replace with s</b></span>

…and so on. With well over 100 species in a typical series of reports, that’s a lot of faffing about. And it has to be done every month. It’s a right pain in the Wheatear.
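For illustration, the four steps listed above can be written as an ordered list of find/replace pairs and applied one after another. This is a minimal Python sketch; the names `wrap`, `REPLACEMENTS` and `apply_replacements` are invented for the example, and only the species from the list are included.

```python
SPAN_OPEN = '<span class="biota bird"><b class="vernacular">'
SPAN_CLOSE = '</b></span>'

def wrap(name):
    """Return a species name inside the biota markup."""
    return SPAN_OPEN + name + SPAN_CLOSE

# Order matters: each step works on the output of the step before it.
REPLACEMENTS = [
    ("Oystercatcher", wrap("Oystercatcher")),
    ("Ringed Plover", wrap("Ringed Plover")),
    # Repair the half-wrapped "Little Ringed Plover" left by the previous step.
    ("Little " + wrap("Ringed Plover"), wrap("Little Ringed Plover")),
    # Pull a trailing plural "s" back inside the markup.
    (SPAN_CLOSE + "s", "s" + SPAN_CLOSE),
]

def apply_replacements(text, replacements):
    """Apply plain-text (find, replace) pairs in sequence."""
    for find, replace in replacements:
        text = text.replace(find, replace)
    return text

sample = "3 Oystercatchers, 2 Little Ringed Plovers, 1 Ringed Plover."
print(apply_replacements(sample, REPLACEMENTS))
```

Running the same list twice over a file would wrap the names a second time, so a sketch like this is only safe on fresh, unmarked text.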

I need to find a way to automate this (under Windows XP, preferably GUI-based), working from a saved list of find’n’replace terms, and would appreciate suggestions. Is there a text editor with a facility for sequencing such operations? I could learn to write code to do it, but that’s a heavy up-front investment. Or would someone like to volunteer to help me put the code together?

Update: I’ve found a solution in ReplaceText which, though it’s sadly no longer supported and apparently doesn’t work under Windows 7, does just what I need.

Image of Oystercatcher in flight at Els Ness, Sanday, Orkney, by lukaaash.

9 thoughts on “How can I automate repetitive find’n’replace operations?”

  1. Saul Cozens

    Andy,

    couple of things you could look at. First, this Wikipedia page might be a useful reference: http://en.wikipedia.org/wiki/Comparison_of_text_editors

    You might want to check for features like S&R history, which can save a lot of time. Or, if you are willing to invest a bit more time, try to understand some basic regular expressions. These will help with the most formulaic operations. You’ll of course have to identify a text editor that does regexp-based S&Rs across multiple files.
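As a rough idea of what the regular-expression route could look like, here is a small Python sketch rather than an editor recipe. It builds one pattern from a list of species names and wraps every match in a single pass; the function names are invented for the example, and the optional trailing "s" only covers simple plurals, so a name like Greylag Geese would need its own entry in the list.

```python
import re

SPAN_OPEN = '<span class="biota bird"><b class="vernacular">'
SPAN_CLOSE = '</b></span>'

def build_species_pattern(species):
    """Compile one regex that matches any listed name, with an optional plural 's'."""
    # Longest names first: alternation takes the first alternative that matches,
    # so a longer name should be tried before any shorter name it begins with.
    ordered = sorted(species, key=len, reverse=True)
    alternation = "|".join(re.escape(name) for name in ordered)
    return re.compile(r"\b(?:" + alternation + r")s?\b")

def mark_species(text, species):
    """Wrap every species name found in text in the biota markup."""
    pattern = build_species_pattern(species)
    return pattern.sub(lambda m: SPAN_OPEN + m.group(0) + SPAN_CLOSE, text)

names = ["Ringed Plover", "Oystercatcher", "Little Ringed Plover"]
print(mark_species("3 Oystercatchers, 2 Little Ringed Plovers, 1 Ringed Plover.", names))
```

Because the text is scanned once from left to right, "Little Ringed Plover" is matched as a whole and there is no need for a follow-up step to repair a half-wrapped "Ringed Plover".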

    Having said that, this is exactly the kind of stuff that the command line really works well on, particularly on *nix-based OSes. Basically, you could write a single line that does one of your S&Rs across all files in a folder. Pop this line into a text file, then you can run it by typing its filename. You can then add new S&Rs to this executable file to fit your needs.
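The same "run every S&R over every file in a folder" idea can also be sketched as a short Python script instead of a shell one-liner. Everything named here is an assumption made for the example: the function `replace_in_folder`, the `*.txt` file pattern and UTF-8 as the file encoding.

```python
from pathlib import Path

def replace_in_folder(folder, replacements, pattern="*.txt"):
    """Apply each (find, replace) pair, in order, to every matching file.

    Files are rewritten in place; the names of the files that changed are returned.
    """
    changed = []
    for path in sorted(Path(folder).glob(pattern)):
        original = path.read_text(encoding="utf-8")
        updated = original
        for find, replace in replacements:
            updated = updated.replace(find, replace)
        if updated != original:
            path.write_text(updated, encoding="utf-8")
            changed.append(path.name)
    return changed
```

A call such as `replace_in_folder("reports", [("Dunlin", "<b>Dunlin</b>")])` would process a hypothetical `reports` folder. Since the files are overwritten, it is safer to run this on copies of the monthly reports.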

    Take a look at this article by Gina Trapani http://lifehacker.com/354546/find-and-replace-text-with-fart – Yes the tool is called FART!

    I think Windows still considers any file with a .bat extension to be a shell script, so once you have one command working, pop it in a text file called replacebird.bat and run that.

    Hope this makes some kind of sense.

  2. Dave Briggs

    Andy – don’t know how to do it, but my understanding is that this sort of text processing is what Perl was made for. Might be a new skill to learn!

  3. Mike Cummins

    Pick a language that you feel would be useful and I will send you an annotated program in that language to do the job you require.

    I would recommend perl, php or python myself due to your web interest.

  4. Charles

    PHP is what I’d use. Or Applescript… though of course that’s not available to you. But it’s a fairly common regex (regular expressions) problem. PHP is exceptionally good at it. If you tied it into pumping the results into a database (MySQL is free) then you’d both have a record week by week, *and* you could automate the output, *and* if you changed how you want the results to look (for a redesign, say) all you change is the PHP code, while the inputs (bird details) remain the same.
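To make the database idea a little more concrete, here is a small sketch in Python with the built-in SQLite module, used here in place of PHP and MySQL purely to keep the example self-contained. The table layout (`site`, `seen_on`, `report`) and the function names are assumptions for illustration; the point is only that the raw reports are stored once and the HTML is generated from them on demand.

```python
import sqlite3
from html import escape

def open_db(path=":memory:"):
    """Open (or create) the sightings database; the schema is a made-up example."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS sightings ("
        "site TEXT NOT NULL, seen_on TEXT NOT NULL, report TEXT NOT NULL)"
    )
    return conn

def add_sighting(conn, site, seen_on, report):
    """Store one day's report; seen_on is an ISO date such as 2011-05-06."""
    conn.execute("INSERT INTO sightings VALUES (?, ?, ?)", (site, seen_on, report))
    conn.commit()

def render_site(conn, site):
    """Build the list items for one site, newest first."""
    rows = conn.execute(
        "SELECT seen_on, report FROM sightings WHERE site = ? ORDER BY seen_on DESC",
        (site,),
    )
    return "\n".join(
        f'<li class="hentry" id="D{seen_on}">{escape(report)}</li>'
        for seen_on, report in rows
    )

conn = open_db()
add_sighting(conn, "Belvide", "2011-05-05", "3 Oystercatchers, 1 Ringed Plover.")
add_sighting(conn, "Belvide", "2011-05-06", "1 Dunlin.")
print(render_site(conn, "Belvide"))
```

If the page design changes, only `render_site` needs rewriting; the stored reports stay as they are.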

