Why there may be discrepancies in the assessment of scientific names
between the Catalog of Fishes and FishBase
By Nicolas Bailly, FishBase Project Manager
WorldFish Center – FishBase Consortium – FishBase
Information and Research Group, Inc. (FIN)
Aquatic Biodiversity Informatics Office, Philippines
Fish enthusiasts may be confused when they find contradictory information
in the Catalog of Fishes (CofF) and FishBase (FB). The independence and different
purposes of the two initiatives explain this situation, which I discuss below
in more detail from both theoretical and practical points of view. I then
provide some numbers on known discrepancies and their low magnitude relative
to the total number of valid species. Finally, I suggest in the last section
why and how discrepancies can be reported to FB.
Before going into further detail, note that the FishBase Consortium recognizes:
- that CofF is an extremely valuable tool,
- that the work of its principal author, W.N. Eschmeyer, and collaborators,
is absolutely essential,
- that CofF is certainly the best database of its type and size, with accounts
on about 60,000 original names for 31,000 valid species and 11,000 names for 5,000 valid genera, gathered from 25,000 references, to give only the principal statistics,
- that FB would not be at its current advanced stage without CofF, and
- that within systematics in general, CofF gives a tremendous advantage to
fish biodiversity informatics compared to other groups of similar size.
The discussion below contains neither critique nor judgment on the quality
of either product or on the immense work achieved. The purpose of the text is
to give some background information and to clarify:
- that the two systems are different,
- that neither is subordinate to the other,
- that, consequently, each deserves its own proper recognition,
- why complete synchronization of the taxonomic and nomenclatural information is difficult, and
- how FB tries to minimize the remaining discrepancies.
So why may CofF and FB deliver contradictory information?
Independent initiatives
The first explanation is that CofF and FB are independent initiatives.
Bill Eschmeyer started CofF more than 25 years ago (*) when he was working at the California Academy of Sciences, which supported and still supports the development and maintenance of the database and its web site. He continues to update, complete
and correct data in almost real time with the help of several teams and many
colleagues around the world (see CofF introduction).
Daniel Pauly and Rainer Froese started FB 20 years ago at ICLARM, now called the WorldFish Center. Today a team of
about 20 persons in various countries (mainly in the Philippines) maintains the
information system, also with the much appreciated help of many collaborators
(often the same as for CofF with respect to taxonomy), under the management
of the FishBase Consortium (nine institutions around the world). Note that
the FB team tries to synchronize the information with CofF as much as
it can.
CofF and FB collaborated closely on several projects in the past 15 years,
up to the inclusion of webpages with CofF content within the FB website. With
the improved web presence of CofF, these pages were no longer needed and were
deleted in early 2008. This may have created the impression that CofF and FB were
completely integrated, which is not the case.
Different purposes
The second explanation is that the nature and the purposes of the information
systems are different.
CofF is both a nomenclator (listing all the new published fish names and assessing
their availability and validity in the framework of the International Code
of Zoological Nomenclature) and an authoritative taxonomic list (giving the
current validity status of each species and its current accepted name). A team examined nearly all original descriptions, photocopied and proofed many of them, and visited nearly all major museums to copy type records (resulting in a world type catalog that is invaluable for initiatives like Fish-BoL), settling publication dates and name spellings with much research.
FishBase, built first and foremost on the taxonomic and nomenclatural backbone assembled by CofF, gives a wider range of information on current valid species. It is
not primarily a taxonomic database, but nonetheless a database on fish systematics
(insofar as it is recognized that taxonomy and systematics are different, the
latter using the former as a structuring and indexing backbone for all biological,
ecological, and other information on species).
Both databases are Biodiversity Information
Systems (BIS) and Global Species Databases (GSD).
Practically
- CofF is usually ahead in reporting new species because it records “only” the
name, the information on type specimens (catalogue number and reference,
locality, etc.), a general statement on the distribution and the water environment,
the name citation and status in subsequent references, and explanations on
taxonomic and nomenclatural difficulties. Few other categories of information
are available, e.g., on species numbers, museum collections, scientific journals,
author names, …
- Comparatively, FB manages more than 60 main tables on many topics, and
many more internal ones. Consequently, recording a new species involves more
work than in CofF.
- FB explicitly cites CofF when necessary, and/or invites users to check
CofF directly for complex details on nomenclature, especially for names
with “ambiguous synonym” status.
- One time-consuming activity in FB is merging two species after a synonymy
has been established, but it is even more difficult to split one species into two.
In that case the team has to revisit every reference and decide which of the two
species it belongs to, and do so for up to 60 topics. Hence, FB
may treat such cases with some delay, whereas these operations
are much faster in CofF. To compensate for this problem, we have recently
added a taxonomic warning at the top of the species summary page, with
statements on current synonymies or invalidity of the species where
needed (this is not yet complete, however).
- In some cases, FB treats a species as valid according to a published work
while CofF considers it as uncertain with no current valid name indicated:
CofF may have decided that the work is not complete enough to follow the
author’s conclusions. In contrast, we think it is better to present
the biological and ecological information related to this uncertain species
so the case can be more easily worked out by colleagues or FB users.
- There are still several thousand original names in CofF that have no current
status. Usually, these names are not in FB because its database structure
does not allow recording names that are not associated with a current
valid species. Generally, however, they are old names that have not been used
for a long time, or that are used only in long-standing museum collections.
- There are cases where the gender agreement of the specific epithet with
the generic noun is not well established (note that the nouns in apposition
as specific epithets are nightmares!). In general, FB follows CofF decisions
on gender, but there may be a time-lag. The same is true for the year of
publication.
- Subspecies may cause some trouble. In FB, when subspecies
are recognized as valid, we do not keep both a species record and a nominotypical
subspecies record. For example, we use Cyprinus carpio carpio as the current
accepted name and mark Cyprinus carpio as a “synonym” (this is
an effect of the database structure), although formally it is a valid name
at species rank. On March 31st, 2010, there were 185 species with 271
valid subspecies in FB. However, FB considers that the subspecies rank should no
longer be used in fishes due to its rarity (271 out of 31,000 valid species-group
taxa!), and that if subspecies are really well circumscribed, they should
be elevated to species rank.
- The suprageneric classifications are almost the same, but FB retains some
proposals of Joe Nelson in his Fishes of the World, especially for subfamilies.
This has little impact, since CofF also follows Nelson’s classification for
the most part, and since it mostly concerns a rank issue (e.g., families
treated as subfamilies or the reverse).
- There are recurrent mistakes peculiar to each database. One type of error in CofF
arises when a different combination is allocated to a valid species: the related
synonyms may not be updated at the same time. In FB, the subfamily assignment
may be inconsistent across species of the same genus. In both cases, the errors
are caused by the respective structures of the databases. Typos are also unavoidable
in both databases.
- The structures of the databases are different: where FB keeps the different
forms of names (“names-as-strings”: original names, new combinations,
misspellings, etc.) in different records (linked to a citation table),
CofF stores all citations in a text field of the same record as the original
name, which is perfect for full-text indexing but much less usable for focused
queries and the production of lists. In some complex nomenclatural cases, this
may produce apparent, but false, contradictions.
- FB links to CofF from the species and synonym summary pages using deep links;
a few may be erroneous for various technical reasons.
- Finally, both CofF and FB rely on the management of human networks, and
unavoidably, we both have to live with contradictory publications and opinions.
Until the taxonomic community finds a way to deliver consensus knowledge
to the rest of society, biodiversity information systems like CofF and
FB will be subject to discrepancies. However, the percentage of such issues
remains relatively low, and the impact minimal.
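The structural difference described in the list above (separate name-as-string records in FB versus a single free-text citation field in CofF) can be illustrated with a minimal sketch. The class names, fields and sample values below are hypothetical simplifications introduced only for illustration; they are not the actual FB or CofF schemas.

```python
from dataclasses import dataclass
from typing import List

# Hypothetical, simplified records: the real FB and CofF schemas are not
# described in this text and are certainly richer.


@dataclass
class NameRecord:
    """FB-like: one record per name-as-string, linked to an accepted species."""
    name: str
    status: str            # e.g. "accepted name", "synonym", "misspelling"
    accepted_species: str  # every name must point to a current valid species


@dataclass
class OriginalNameRecord:
    """CofF-like: one record per original name, citations kept as free text."""
    original_name: str
    citations_text: str


# The subspecies case from the text: the species-rank name is flagged as a
# "synonym" of the nominotypical subspecies because of the record structure.
fb_like = [
    NameRecord("Cyprinus carpio carpio", "accepted name", "Cyprinus carpio carpio"),
    NameRecord("Cyprinus carpio", "synonym", "Cyprinus carpio carpio"),
]

coff_like = OriginalNameRecord(
    original_name="Cyprinus carpio",
    citations_text="(placeholder) original description; later combinations; current status",
)


def names_for(species: str, records: List[NameRecord]) -> List[str]:
    """Focused query: every recorded name-string linked to one accepted species."""
    return [r.name for r in records if r.accepted_species == species]


def mentions(term: str, record: OriginalNameRecord) -> bool:
    """Full-text lookup: plain substring search in the free-text field."""
    return term.lower() in record.citations_text.lower()
```

In this sketch, `names_for` produces a list directly from structured records, while `mentions` can only search the text. A name with no current status has no natural place in `NameRecord`, because `accepted_species` is mandatory there.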
Low relative magnitude of discrepancies
One important fact is that the error rate is below 5% in both databases.
Admittedly, multiplied by 31,000 or so, this still makes ca. 1,500 records, and
no user is happy to find errors. However, note that CofF and FB already agree
completely on species validity, name, author and year of publication for 94%
of valid names. Different spellings (gender agreement, author names) and years
of publication account for 1%, and different combinations for an additional 2.5%.
So, all in all, there is only a 2.5% difference in terms of species validity, not
including the delays in encoding new species in FB compared with CofF (between
300 and 500 per year in the past 12 years).
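As a rough check of these figures, the percentages can be converted into approximate species counts. This is only a back-of-the-envelope sketch: it assumes the rounded total of 31,000 valid species quoted in the text, and the category labels are shorthand chosen here.

```python
valid_species = 31_000  # approximate total quoted in the text

# Percentage of valid names in each category, as given in the text.
shares = {
    "complete correspondence": 94.0,
    "different spellings or years": 1.0,
    "different combinations": 2.5,
    "different species validity": 2.5,
}

# The four categories should cover every valid name.
assert sum(shares.values()) == 100.0

# Convert each share into an approximate number of species.
counts = {label: round(valid_species * pct / 100) for label, pct in shares.items()}
for label, n in counts.items():
    print(f"{label}: about {n} species")

# "Less than 5%" of 31,000 gives the upper bound behind "ca. 1,500 records".
error_upper_bound = round(valid_species * 5 / 100)
print(f"upper bound on erroneous records: {error_upper_bound}")
```

Under these assumptions, the 2.5% difference in species validity corresponds to about 775 species, and the 5% bound to 1,550 records, which is consistent with the "ca. 1,500" quoted above.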
How to report discrepancies to FB
We realize that consistent information on species validity and current accepted
names between CofF and FB is important from a user
perspective. Please bear with us, and help us to maintain this consistency.
If you find discrepancies, we suggest that you alert FB first, as it is somewhat
more likely that FB has not yet included an update; this way, CofF specialists
are not mobilized needlessly. Please use the “Comments and
corrections” page from the Species summary page concerned (Top menu “Feedback”,
or bottom of the page). We will make it a priority to check and revise the issue
for the next update (now done every two months). Moreover, we will report to
CofF if we find it more likely that the revision should be made in CofF.
Many thanks to Rainer Froese (IFM-GEOMAR), Emily Capuli (WorldFish Center) and Gert Boden (MRAC) for reviewing and improving the text, and to Bill Eschmeyer (formerly CAS) for historical corrections on version 1.
Web published version no. 2 (May 6th, 2010)
Previous versions
- Web published version no. 1 (April 1st, 2010)
- Abridged version published in English and French by the Société Française d'Ichtyologie in SFInfos n°53.
- (*) Note that I mistakenly wrote "40 years ago" in version 1 and its abridged version.