Monday 28 December 2020

Accessing birding data for Mallacoota

From my experience the best data set to use for longer term broad-scale information about the birds in an area is eBird.  The BirdLife Australia Birdata application has some benefits, but is far less popular so I will discuss that later.  I will also conclude with some personal thoughts comparing eBird with Birdata.

Use of eBird

There are several ways of getting at the data of which the most authoritative is the final data which can be downloaded (at no cost).  There are a few inconveniences about this process:

  1. The smallest geographic unit in Australia for which data can be downloaded is the LGA level (in our case East Gippsland).  The file of all records is about 25 Mb which can make an impact on download limits, and a file of records for the past 6 months is 0.5 - 1.0 Mb.
  2. East Gippsland is about 8,000 sq miles in area (bigger than 6 States of the US) so I have to subset the data to make it relevant to Mallacoota.  (It would be reely reely good if eBird had a polygon tool as does Birdata (see below), but this seems not to be realised by the mega-brains at Cornell who need to get out and talk to their users a little more.)
  3. The names used by eBird Central (at Cornell U, in upstate New York) are "American" names, not those in the Australian portal to eBird used for data entry.  For example:
    1. You'd think a country that can put a man on the moon could spell 'Grey' properly.  
    2. You'd think that Australians should say whether Chenonetta jubata is called Australian Wood Duck or Maned Duck.  
  4. Nope.  So every update requires a couple of hours fiddling to get it consistent with existing data downloaded and cleaned for previous months.
  5. The data for a month doesn't become available until the 15th of the following month.  This allows time for records to be submitted but is a cramp when trying to produce timely output.
However it is possible to do what is needed with some fiddling around.  
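
For anyone who would rather script some of that fiddling, here is a minimal sketch in Python of the two main chores: cutting the shire-wide download down to the Mallacoota area and harmonising the names.  It is only an illustration under stated assumptions: the bounding box coordinates, the field names (common_name, latitude, longitude) and the lookup table are all made up for the example and would need to be matched to the real download.

```python
import re

# Illustrative bounding box around Mallacoota; these coordinates are an
# assumption for the example, not the area definition used in this post.
MALLACOOTA_BOX = {"south": -37.80, "north": -37.40, "west": 149.50, "east": 149.98}

# Hypothetical lookup of eBird Central names that need a wholesale swap.
NAME_FIXES = {"Maned Duck": "Australian Wood Duck"}


def in_box(lat, lon, box=MALLACOOTA_BOX):
    """True if the point lies inside the bounding box."""
    return box["south"] <= lat <= box["north"] and box["west"] <= lon <= box["east"]


def australian_name(name, fixes=NAME_FIXES):
    """Swap known names first, then apply a crude Gray -> Grey spelling rule."""
    if name in fixes:
        return fixes[name]
    return re.sub(r"\bGray\b", "Grey", name)


def subset_records(records, box=MALLACOOTA_BOX):
    """Keep records inside the box and return copies with harmonised names."""
    kept = []
    for rec in records:
        if in_box(rec["latitude"], rec["longitude"], box):
            kept.append({**rec, "common_name": australian_name(rec["common_name"])})
    return kept


if __name__ == "__main__":
    sample = [
        {"common_name": "Maned Duck", "latitude": -37.56, "longitude": 149.75},
        {"common_name": "Gray Fantail", "latitude": -37.70, "longitude": 148.46},
    ]
    for rec in subset_records(sample):
        print(rec["common_name"])
```

A rectangle is a blunt substitute for a proper polygon, and the spelling rule would need checking against the full species list, so treat the output as a starting point for manual review rather than a finished list.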

My observations

My starting point is to get a list of what I have seen in the area.  This used to generate a list of sites visited during a period from which those out of the area of interest could be deleted.  However eBird was then "improved" so that the list of birds seen could only be limited to an LGA: see point 2 above.  I then delete birds seen only elsewhere in the shire (eg Cann River; Cape Conran).  I can then use the list of species I have seen. By way of example this gave a basic list of 94 species for December 2020.

Another inconvenience rears its head in that the name of species appears in the form "Common Name - scientific name".  The problem is that most of my information is stored under common name only.  So the next step is to split the name field.  As the EXCEL facility Data>Text to Columns only works on a single character (and splitting on '-' causes havoc as many common names use hyphens eg Black-faced Cuckooshrike) this requires a further step to replace ' - ' with '#'.  Then split the name on the character '#'.
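
The same split can be done in one pass outside EXCEL.  As a small sketch in Python (the function name split_name is just for illustration), splitting on the three-character separator ' - ' leaves hyphenated common names alone:

```python
def split_name(combined):
    """Split 'Common Name - scientific name' on the first ' - ' only.

    A plain hyphen inside the common name (eg Black-faced) has no spaces
    around it, so it is not treated as the separator.
    """
    common, _sep, scientific = combined.partition(" - ")
    return common.strip(), scientific.strip()


if __name__ == "__main__":
    print(split_name("Black-faced Cuckooshrike - Coracina novaehollandiae"))
```

If an entry has no ' - ' at all, the whole string comes back as the common name and the scientific name is empty, which makes such rows easy to spot.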

I then import the list of names to my ACCESS database as a table with a name in the form "t <date> MB <month>".  My next step is to add to that table any incidental records I have accumulated from reports to the Mallacoota Birds FB group, conversations with residents, eBird rare bird alerts etc.  This step added about 10 species in December 2020.

As an aside, if the rare bird report alerts me to the presence of someone I consider to be an expert birder visiting the area I will often look at the checklist concerned which can give rise to several other additional species.  No names, no pack drill!

Other Observations x Hotspot

My next step is to go to the eBird "Explore>Explore Hotspots" page and enter the word Mallacoota.  That gives a list with 20 entries of which I start by selecting "Mallacoota, East Gippsland, AU-VIC".  That is a generic site of little use for detailed studies but given my aim is to generate a list for a fairly large area it is fit for purpose, especially as many visitors use the hotspot for all their observations during a short stay.  I typically scan the list of species shown within the month of interest (filtered as part of the eBird process) and note the names of species not previously included in my list.  I then append the new names to the table "t <date> MB <month>".

I repeat this for other popular hotspots: this usually includes Captain Stevenson's Point, Bastion Point, Betka Beach, Wastewater Treatment Plant, Gipsy Point and Howe Flat.  This added about 20 species.
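
The "note what is new and append it" step is really a set difference.  A minimal sketch in Python, assuming the monthly table and each hotspot list are simply lists of common names (the function name and the sample names are illustrative only):

```python
def append_new_species(monthly_list, hotspot_species):
    """Append hotspot names not already in the monthly list.

    Comparison ignores case so 'silvereye' and 'Silvereye' count as the
    same species.  Returns the names that were actually added.
    """
    seen = {name.casefold() for name in monthly_list}
    added = []
    for name in hotspot_species:
        key = name.casefold()
        if key not in seen:
            monthly_list.append(name)
            seen.add(key)
            added.append(name)
    return added


if __name__ == "__main__":
    monthly = ["Silvereye", "Hooded Plover"]
    new = append_new_species(monthly, ["Hooded Plover", "Eastern Koel"])
    print(new, monthly)
```

This only works cleanly if the names have already been harmonised, since "Painted Button-quail" and "Painted Buttonquail" would be treated as two species.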

Other Observations x Species

The first step in this is to compile a list of the species seen for the study month in other years, using my ACCESS database.  I then use an ACCESS query wizard to identify the species seen previously in the month and not for this year (which I refer to as MIAs (Missing in Action)).  In doing this I score the number of previous reports of each of the species to prioritise remaining investigations.
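
For readers without ACCESS, the logic of the MIA query can be sketched in a few lines of Python.  This is not the query itself, just an equivalent under simple assumptions: previous_reports holds one common name per past report for the study month, and current_species is the set of names already on this year's list.

```python
from collections import Counter


def missing_in_action(previous_reports, current_species):
    """Return (species, number of previous reports) for species not yet
    recorded this year, most frequently reported first."""
    counts = Counter(
        name for name in previous_reports if name not in current_species
    )
    return counts.most_common()


if __name__ == "__main__":
    previous = ["Hooded Plover", "Hooded Plover", "Eastern Koel", "Silvereye"]
    current = {"Silvereye"}
    for species, n in missing_in_action(previous, current):
        print(species, n)
```

Sorting by the number of previous reports puts the species most likely to have been overlooked at the top, which is where the remaining searching is best spent.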

I then work my way down the list of MIA sorted by number of previous records.  The next step is to use the eBird Explore>Species Maps facility.  I enter a species name and zoom the map to show my area of interest around Mallacoota.  If the species has been seen in the last month an orange pin is shown and I add the name to the table.  If no orange appears I move to the next species.

The process is rather time consuming and thus I tend to desist when I have had several species which continue as MIA.  In December 2020 this step added about 20 species to the list.

Use of Birdata

This application has improved dramatically in recent years.  When first launched it appeared to be only useful for repeated scientific-quality surveys of limited appeal to recreational birders (or indeed professional birders working more widely).  Thus most birders took up eBird and, having put in the effort to get their life lists into eBird, and become familiar with using the app in the field, continue to use it.  However some folk do use Birdata to report for the Mallacoota region so I scan the output for the application each month to see if additional species have been reported.  There are 3 major advantages for Birdata:
  1. The names used are those commonly used in Australia and are in most cases the official common names for the species;
  2. It is possible to download data for a custom polygon which is a massive benefit for analytical purposes; and
  3. Once reported, the data is available more or less instantly across the whole system (rather than having to wait 15 days as with eBird).
The access path to Birdata output may not be familiar to readers so here is an annotated screenshot of how I get at, and use, it.
The yellow arrows point to the 5 sites for which data has been reported in the month.  Note the polygon I have drawn (1) which is close enough to the same as my definition of the Mallacoota area I use in eBird.  Dates are set at (2) to filter results for the month of concern.  I have circled (3) the observation of Painted Button-quail (in eBird, Painted Buttonquail) as that is not included in the eBird output.  (As an aside there are only 3 reports of Painted Buttonquail in all eBird data for the area, none in December.)  So I have added Painted Buttonquail to my table for the monthly report.

Comparison of eBird and Birdata

I stress that what follows are my personal views which, with $5, will get you a cup of nice coffee at Lucies'.  It is my strong belief that the two organisations should work harder to exchange data, since I believe that considerable useful information is lost to researchers under the current arrangements.

I also stress that what follows is written from a perspective of data utility for reporting birds seen each month in the Mallacoota area.  As Birdata is (quite reasonably) restricted to Australia it will have less appeal to those who - if the national borders ever open - travel for birding and have world-wide lists.

The advantage of eBird is well illustrated by elements in the graphic above.
  1. The Birdata report covers 5 locations in the area.  I have personally submitted reports for 16 areas in December 2020 and know that others have reported for many more areas.  However data for the area as a whole is not yet available (see point 5 under inconveniences above).  In December 2019, which is probably broadly comparable in terms of birding effort, 41 locations were birded (a few of these - at a glance 4 names affecting 2 localities - are different only in the detail of the locality name (eg Pebble Beach is obviously a typo for Pebbly Beach) so perhaps 39 localities is a better measure).
  2. The Birdata report covers 52 species.  At the time of writing eBird has 157 species for the area.  
I mentioned some advantages of Birdata above.  A further point is that Birdata is focused on Australia and can be influenced by Australian birders.  In contrast there are 614,000 eBirders worldwide as against 13,000 in Australia: which is the dog and which the tail is clear.

Some supporters of Birdata complain that eBird data is of lower quality as "anyone can report anything".  I do not accept that view since:
  • eBird data is routinely reviewed to the extent sensible by very experienced moderators familiar with the areas involved;
  • filters to identify likely errors are set at quite tight levels in eBird and reviewed as often as possible (of course, the difficulty with making a system idiot-proof is that 'they' keep making bigger idiots); and 
  • I have seen extremely dodgy reports enter, and remain for some time in, the Birdata dataset.  By way of example a report of Black Currawong (a Tasmanian endemic) was shown against Central Victoria for several years.
A further criticism of eBird is that the much used travelling and incidental protocols in eBird are too loose for scientific analysis.  Again I do not accept this since:
  • the incidental protocol is clearly identified and if wished those records can be filtered out of an analysis;
  • many of the travelling protocols must by definition fit within Birdata protocols.  For example any eBird travelling record with a distance >1km and <5km must fit within a Birdata 5km radius protocol.
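
Both points are easy to apply once the checklists are in a table.  A rough sketch in Python, assuming each checklist is a record with a protocol label and a distance in km; the field names and the protocol labels "Incidental" and "Traveling" are my assumptions here and should be checked against the actual download:

```python
def drop_incidental(checklists):
    """Remove checklists submitted under the incidental protocol."""
    return [c for c in checklists if c["protocol"] != "Incidental"]


def fits_5km_radius(checklist):
    """True for a travelling checklist longer than 1km and shorter than 5km,
    the range described above as fitting a 5km radius protocol."""
    return checklist["protocol"] == "Traveling" and 1 < checklist["distance_km"] < 5


if __name__ == "__main__":
    checklists = [
        {"id": "A", "protocol": "Incidental", "distance_km": None},
        {"id": "B", "protocol": "Traveling", "distance_km": 3.2},
        {"id": "C", "protocol": "Traveling", "distance_km": 12.0},
    ]
    usable = drop_incidental(checklists)
    print([c["id"] for c in usable if fits_5km_radius(c)])
```

The point is only that the protocol is recorded against every checklist, so an analyst can choose how strict to be.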
From my view it is difficult, without invoking attractive conspiracy theories, to understand why the two systems cannot talk to each other.  It would be extremely beneficial for birders, analysts and birds if the two groups could work together on data exchange so that the benefits of both systems became available through a common interface.  


