I’m the webmaster (and a trustee) of the West Midland Bird Club, a registered charity.
Every month, I get sent a series of text files, with lists of bird sightings at each of our reserves, and some other locations. They usually comprise around thirty entries like these:
- 6th: 1 Dunlin, 1 Oystercatcher, 2 Little Ringed Plovers, 1 Common Sandpiper, pair Shelduck, pair Greylag Geese flew over, 1 Cuckoo, pair Kingfisher, 2 Lesser Whitethroat at north end of Reserve.
- 5th: 3 Oystercatchers, 1 Ringed Plover.
and I need to turn them into HTML markup like this:
<li class="hentry" id="D2011-05-06"><span class="entry-content"><abbr class="updated entry-title" title="2011-05-06">6th</abbr>: 1 <span class="biota bird"><b class="vernacular">Dunlin</b></span>, 1 <span class="biota bird"><b class="vernacular">Oystercatcher</b></span>, 2 <span class="biota bird"><b class="vernacular">Little Ringed Plovers</b></span>, 1 <span class="biota bird"><b class="vernacular">Common Sandpiper</b></span>, pair <span class="biota bird"><b class="vernacular">Shelduck</b></span>, pair <span class="biota bird"><b class="vernacular">Greylag Geese</b></span> flew over, 1 <span class="biota bird"><b class="vernacular">Cuckoo</b></span>, pair <span class="biota bird"><b class="vernacular">Kingfisher</b></span>, 2 <span class="biota bird"><b class="vernacular">Lesser Whitethroat</b></span> at north end of Reserve.</span></li>
<li class="hentry" id="D2011-05-05"><span class="entry-content"><abbr class="updated entry-title" title="2011-05-05">5th</abbr>: 3 <span class="biota bird"><b class="vernacular">Oystercatchers</b></span>, 1 <span class="biota bird"><b class="vernacular">Ringed Plover</b></span>.</span></li>
to make pages like this one: westmidlandbirdclub.com/belvide/latest.
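For readers who would script this step, here is a minimal Python sketch of the date wrapper only (the `<li>`, `<span>` and `<abbr>` around each entry). The function name `wrap_entry` is hypothetical, and because the text files give only the day, the year and month are assumed to be supplied by hand.

```python
import re

# Matches entries such as "- 6th: 1 Dunlin, ..." (day number, ordinal suffix, rest).
ENTRY = re.compile(r"^-\s*(\d{1,2})(st|nd|rd|th):\s*(.*)$")


def wrap_entry(line, year, month):
    """Wrap one '- 6th: ...' line in the hentry markup shown above.

    Hypothetical helper: year and month are assumptions passed in by the
    caller, since the source entries only carry the day of the month.
    """
    match = ENTRY.match(line.strip())
    if match is None:
        return line  # leave anything unrecognised untouched
    day, suffix, rest = match.groups()
    iso = f"{year:04d}-{month:02d}-{int(day):02d}"
    return (
        f'<li class="hentry" id="D{iso}"><span class="entry-content">'
        f'<abbr class="updated entry-title" title="{iso}">{day}{suffix}</abbr>: '
        f"{rest}</span></li>"
    )


print(wrap_entry("- 5th: 3 Oystercatchers, 1 Ringed Plover.", 2011, 5))
```

The species names are left alone here; marking those up is a separate pass.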
That involves a series of find’n’replace operations, in sequence, like:
- Find
Oystercatcher
and replace with <span class="biota bird"><b class="vernacular">Oystercatcher</b></span>
- Find
Ringed Plover
and replace with <span class="biota bird"><b class="vernacular">Ringed Plover</b></span>
- Find
Little <span class="biota bird"><b class="vernacular">Ringed Plover</b></span>
and replace with <span class="biota bird"><b class="vernacular">Little Ringed Plover</b></span>
- Find
</b></span>s
and replace with s</b></span>
…and so on. With well over 100 species in a typical series of reports, that’s a lot of faffing about. And it has to be done every month. It’s a right pain in the Wheatear.
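The sequence above could be sketched in Python roughly as follows. This is an illustration under stated assumptions, not a finished tool: the names `tag`, `REPLACEMENTS` and `apply_in_sequence` are made up for the example, and only the four rules listed above are included. The order matters, because the third rule depends on the output of the second.

```python
def tag(name):
    """Build the species markup used in the examples above."""
    return f'<span class="biota bird"><b class="vernacular">{name}</b></span>'


# Ordered (find, replace) pairs, mirroring the manual steps one for one.
REPLACEMENTS = [
    ("Oystercatcher", tag("Oystercatcher")),
    ("Ringed Plover", tag("Ringed Plover")),
    # Repair "Little Ringed Plover", which the previous rule only half-tagged.
    ("Little " + tag("Ringed Plover"), tag("Little Ringed Plover")),
    # Pull a trailing plural "s" back inside the markup.
    ("</b></span>s", "s</b></span>"),
]


def apply_in_sequence(text, replacements):
    """Apply each plain-text find-and-replace in turn, top to bottom."""
    for find, replace in replacements:
        text = text.replace(find, replace)
    return text


print(apply_in_sequence("3 Oystercatchers, 2 Little Ringed Plovers.", REPLACEMENTS))
```

In practice the list would live in a saved file rather than in the script, so that a new species means one new line rather than a code change.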
I need to find a way to automate this (under Windows XP, preferably GUI-based), working from a saved list of find’n’replace terms, and would appreciate suggestions. Is there a text editor with a facility for sequencing such operations? I could learn to write code to do it, but that’s a heavy up-front investment. Or would someone like to volunteer to help me put the code together?
Update: I’ve found a solution in ReplaceText which, though it’s sadly no longer supported and apparently doesn’t work under Windows 7, does just what I need.
Image of Oystercatcher in flight at Els Ness, Sanday, Orkney, by lukaaash.
Andy,
A couple of things you could look at. First, this Wikipedia page might be a useful reference: http://en.wikipedia.org/wiki/Comparison_of_text_editors
You might want to check for features like search-and-replace (S&R) history, which can save a lot of time. Or, if you are willing to invest a bit more time, try to understand some basic regular expressions. These will help with the most formulaic operations. You’ll of course have to identify a text editor that does regexp-based S&Rs across multiple files.
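To give a flavour of what a regular expression buys you, here is a hedged Python sketch that marks up every listed species in a single pass, instead of one find-and-replace per name. The three-name `SPECIES` list and the function names are assumptions for illustration, and the optional trailing "s" only covers simple plurals (not "Geese").

```python
import re

# Assumed sample list; a real one would be loaded from a saved file.
SPECIES = ["Oystercatcher", "Ringed Plover", "Little Ringed Plover"]


def build_pattern(species):
    """Compile one pattern that matches any listed name, with optional 's'."""
    # Longest names first, so that where two names start at the same
    # position the longer one is tried before the shorter.
    names = sorted(species, key=len, reverse=True)
    alternation = "|".join(re.escape(name) for name in names)
    return re.compile(rf"\b(?:{alternation})s?\b")


def mark_up(text, pattern):
    """Wrap each match, plural 's' included, in the species markup."""
    return pattern.sub(
        lambda m: f'<span class="biota bird"><b class="vernacular">{m.group(0)}</b></span>',
        text,
    )


pattern = build_pattern(SPECIES)
print(mark_up("2 Little Ringed Plovers, 1 Ringed Plover.", pattern))
```

Because the text is scanned once from left to right, "Little Ringed Plover" is matched whole and never needs repairing afterwards.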
Having said that, this is exactly the kind of stuff that the command line works really well on, particularly on *nix-based OSes. Basically, you could write a single line that does one of your S&Rs across all files in a folder. Pop this line into a text file, then you can run it by typing its filename. You can then add new S&Rs to this executable file to fit your needs.
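The same idea, one set of S&Rs run over every file in a folder, can also be sketched in Python rather than as a shell one-liner. Everything named here is an assumption for illustration: the function `replace_in_folder`, the `*.txt` file pattern, UTF-8 encoding, and the choice to write results to a sibling `.html` file so the original report is left untouched.

```python
from pathlib import Path


def replace_in_folder(folder, replacements, pattern="*.txt"):
    """Apply ordered (find, replace) pairs to every matching file in a folder.

    Returns the names of the source files whose text was changed.
    """
    changed = []
    for path in sorted(Path(folder).glob(pattern)):
        original = path.read_text(encoding="utf-8")  # encoding is an assumption
        updated = original
        for find, replace in replacements:
            updated = updated.replace(find, replace)
        if updated != original:
            # Write alongside the source file instead of overwriting it.
            path.with_suffix(".html").write_text(updated, encoding="utf-8")
            changed.append(path.name)
    return changed
```

A call such as `replace_in_folder("reports", [("Dunlin", "<b>Dunlin</b>")])` would process every `.txt` file in a (hypothetical) `reports` folder; adding a new S&R means adding one more pair to the list.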
Take a look at this article by Gina Trapani http://lifehacker.com/354546/find-and-replace-text-with-fart – Yes the tool is called FART!
I think Windows still treats any file with a .bat extension as a batch script, so once you have one command working, pop it in a text file called replacebird.bat and run that.
Hope this makes some kind of sense.
also:
http://www.programmersheaven.com/download/41236/download.aspx
Thanks, folks. There are some interesting leads for me to follow up, there.
I’ve added a couple of lines, clarifying that I need to work from a saved list of find’n’replace terms, month after month.
Andy – I don’t know how to do it, but my understanding is that this sort of text processing is what Perl was made for. Might be a new skill to learn!
Yep, Perl will do it (or Ruby or any programming language) – Perl and Ruby are good for the job.
Notepad++ is free and does regex search and replace: http://notepad-plus-plus.org/ – handy tool for any HTML author to have around. Works on most flavours of Windows.
Pick a language that you feel would be useful and I will send you an annotated program in that language to do the job you require.
I would recommend Perl, PHP or Python myself, given your web interest.
Thanks to everyone for suggestions, and especially Mike who was all fired up to code something, but, as you can see in the update at the end of my post, I’ve found just what I need.
PHP is what I’d use. Or AppleScript… though of course that’s not available to you. But it’s a fairly common regex (regular expressions) problem, and PHP is exceptionally good at it. If you tied it into pumping the results into a database (MySQL is free) then you’d have a record week by week, *and* you could automate the output, *and* if you changed how you want the results to look (for a redesign, say) all you’d change is the PHP code, while the inputs (bird details) remain the same.