Genetic resources in safe hands

Among the most important—and most used—collections of plant genetic resources for food and agriculture (PGRFA) are those maintained by eleven of the fifteen international agricultural research centers¹ funded through the Consultative Group on International Agricultural Research (CGIAR). Not only are the centers key players in delivering many of the 17 Sustainable Development Goals (SDGs) adopted by the United Nations in 2015, but their germplasm collections are the genetic base of food security worldwide.

Over decades these centers have collected and carefully conserved their germplasm collections, placing them under the auspices of the Food and Agriculture Organization (FAO), and now, the importance of the PGRFA held by CGIAR genebanks is enshrined in international law, through agreements between CGIAR Centers and the International Treaty on Plant Genetic Resources for Food and Agriculture (ITPGRFA)². These agreements oblige CGIAR genebanks to make collections and data available under the terms of the ITPGRFA and to manage their collections following the highest standards of operation.

Evaluation and use of the cultivated and wild species in these large collections have led to the development of many new crop varieties, increases in agricultural productivity, and improvements in the livelihoods of millions upon millions of farmers and poor people worldwide. The genomic dissection of so many crops is further enhancing access to these valuable resources.

The CGIAR genebanks
In the Americas, CIP in Peru, CIAT in Colombia, and CIMMYT in Mexico hold important germplasm collections of: potatoes, sweet potatoes and other Andean roots and tubers; beans, cassava, and tropical forages; and maize and wheat, respectively. All these collections also include substantial representation of the closest wild relatives of these important crops.

In Africa, there are genebanks at Africa Rice in Côte d’Ivoire, IITA in Nigeria, ILRI in Ethiopia, and World Agroforestry in Kenya, holding collections of: rice; cowpea and yams; tropical forage species; and a range of forest fruit and tree species, respectively.

ICARDA had to abandon its headquarters in Aleppo in northern Syria, and has recently relocated to two sites in Morocco and Lebanon.

ICRISAT in India and IRRI in the Philippines have two of the largest genebank collections, of: sorghum, millets, and pigeon pea; and rice and its wild relatives.

There is just one CGIAR genebank in Europe, for bananas and plantains, maintained by Bioversity International (which has its headquarters in Rome) at the University of Leuven in Belgium.

Genebank security
Today, the future of these genebanks is brighter than it has been for many years. Since 2012 they have received ‘secure’ funding through the Genebanks CGIAR Research [Support] Program or Genebanks CRP, in collaboration with, and with funding from, the Crop Trust. It was this Genebanks CRP that I and my colleagues Brian Ford-Lloyd and Marisé Borja evaluated during 2016/17. You may read our final evaluation report here. Other background documents and responses to the evaluation can be found on the Independent Evaluation Arrangement website. The CRP was superseded by the Genebank Platform at the beginning of 2017.

As part of the evaluation of the Genebanks CRP, Brian Ford-Lloyd and I attended the Annual Genebanks Meeting in Australia in November 2016, hosted by the Australian Grains Genebank at Horsham, Victoria.

While giving the Genebanks CRP a favorable evaluation—it has undoubtedly enhanced the security of the genebank collections in many ways—we did call attention to the limited public awareness about the CGIAR genebanks among the wider international genetic conservation community. And although the Platform has a website (as yet with some incomplete information), it seems to me that the program is less proactive with its public awareness than under the CGIAR’s System-wide Genetic Resources Program (SGRP) more than a decade ago. Even the folks we interviewed at FAO during our evaluation of the Genebanks CRP indicated that this aspect was weaker under the CRP than the SGRP, to the detriment of the CGIAR.

Now, don’t get me wrong. I’m not advocating any return to the pre-CRP or Platform days or organisation. However, the SGRP and its Inter-Center Working Group on Genetic Resources (ICWG-GR) were the strong foundations on which subsequent efforts have been built.

The ICWG-GR
When I re-joined the CGIAR in July 1991, taking charge of the International Rice Genebank at IRRI, I became a member of the Inter-Center Working Group on Plant Genetic Resources (ICWG-PGR), but didn’t attend my first meeting until January 1993. I don’t think there was one in 1992, but if there was, I was not aware of it.

We met at the campus of the International Livestock Centre for Africa (ILCA)³ in Addis Ababa, Ethiopia. It was my first visit to any African country, and I do remember that on the day of arrival, after having had a BBQ lunch and a beer or three, I went for a nap to get over my jet-lag, and woke up 14 hours later!

I’m not sure if all genebanks were represented at that ILCA meeting. Certainly genebank managers from IRRI, CIMMYT, IITA, CIP, ILCA, and IPGRI (the International Plant Genetic Resources Institute, now Bioversity International) attended, but maybe there were more. I was elected Chair of the ICWG-PGR, as it was then, for three years. These were important years. The Convention on Biological Diversity had been agreed during the June 1992 Earth Summit in Rio de Janeiro, and was expected to come into force later in 1993. The CGIAR was just beginning to assess how that would impact on its access to, and exchange and use of, genetic resources.

L-R: Brigitte Maass (CIAT), Geoff Hawtin (IPGRI), ??, Ali Golmirzaie (CIP), Jan Valkoun (ICARDA), ??, ??, Masa Iwanaga (IPGRI), Roger Rowe (CIMMYT), ?? (ICRAF), Melak Mengesha (ICRISAT), Mike Jackson (IRRI), Murthi Anishetty (FAO), Quat Ng (IITA), Jean Hanson (ILCA), Jan Engels (IPGRI).

We met annually, and tried to visit a different center and its genebank each year. In 1994, however, the focus was on strengthening the conservation efforts in the CGIAR, and providing better coordination of these across the system of centers. The SGRP was born, and the remit of the ICWG-PGR (as the technical committee of the program) was broadened to include non-plant genetic resources, bringing into the program ICLARM (the International Centre for Living Aquatic Resources Management, now WorldFish, but at that time based in Manila), the food policy institute, IFPRI, in Washington DC, the forestry center, CIFOR, in Indonesia, and ICRAF (the International Centre for Research on Agro-Forestry, now World Agroforestry) in Nairobi. The ICWG-PGR morphed into the ICWG-GR to reflect this broadened scope.

Here are a few photos taken during our annual meetings at IITA, at ICRAF (meetings were held at a lodge near Mt. Kenya), and at CIP, where we had the opportunity to visit the field genebanks for potatoes and Andean roots and tubers at Huancayo, 3100 m, in central Peru.

The System-wide Genetic Resources Program
The formation of the SGRP was an outcome of a review of the CGIAR’s genebank system in 1994. It became the only program of the CGIAR in which all 16 centers at that time participated (ISNAR, the International Service for National Agricultural Research, based in The Hague, Netherlands, closed its doors in March 2004), bringing in trees and fish, agricultural systems where different types of germplasm should be deployed, and various policy aspects of germplasm conservation costs, intellectual property, and use.

In 1995 the health of the genebanks was assessed in another review, and recommendations were made to upgrade infrastructure and technical guidelines and procedures. In our evaluation of the Genebanks CRP in 2016/17 we found that some of these had only recently been addressed, once the secure funding through the CRP had provided centers with sufficient external support.

SGRP and the ICWG-GR were major players at the FAO International Technical Conference on Plant Genetic Resources held in Leipzig in 1996.

Under the auspices of the SGRP two important books were published, in 1997 and 2004 respectively. The first, Biodiversity in Trust, was written by 69 genebank managers, plant breeders and others working with germplasm in the CGIAR centers, and documented the conservation and use status of 21 species or groups of species. It was an important assessment of the CGIAR genebank collections and their use, and a significant contribution not only in the context of the Convention on Biological Diversity, but also to FAO’s own monitoring of PGRFA that eventually led to the International Treaty in 2004.

The second, Saving Seeds, was a joint publication of IFPRI and the SGRP, and was the first comprehensive study to calculate the real costs of conserving seed collections of crop genetic resources. Costing the genebanks still bedevils the CGIAR, and it still has not been possible to arrive at a costing system that reflects both the heterogeneity of conservation approaches and how the different centers operate in their home countries, with their own organizational structures and different cost bases. One model does not fit all.

In 1996/97 I’d been impressed by some research from the John Innes Institute in the UK about gene ‘homology’ or synteny among different cereal crops. I started developing some ideas about how this might be applied to the evaluation of genebank collections. In 1998, the ICWG-GR gave me the go-ahead (and a healthy budget) to organize an international workshop on Genebanks and Comparative Genetics that I’d been planning. With the help of Joel Cohen at ISNAR, we held the workshop there in August 1999, inviting all the genebank managers, staff working at the centers on germplasm, and many of the leading lights from around the world in crop molecular biology and genomics, a total of more than 50 participants.

This was a pioneer event for the CGIAR, and certainly the CGIAR genebank community was way ahead of others in the centers in thinking through the possibilities for genomics, comparative genetics, and bioinformatics for crop improvement. Click here to read a summary of the workshop findings published in the SGRP Annual Report for 1999.

The workshop was also highlighted in Promethean Science, a 41-page position paper published in 2000 on the importance of agricultural biotechnology, authored by former CGIAR Chair and World Bank Vice-President Ismail Serageldin and Gabrielle Persley, a senior strategic science leader who has worked with some of the world’s leading agricultural research and development agencies. They address the importance of characterizing biodiversity (and the workshop) on pages 21-23.

Although there was limited uptake of the findings from the workshop by individual centers (at IRRI for instance, breeders and molecular biologists certainly gave the impression that we genebankers had strayed onto their turf, trodden on their toes so to speak, even though they had been invited to the workshop but had chosen not to attend), the CGIAR had, within a year or so, taken on board some of the findings from the workshop, and developed a collective vision related to genomics and bioinformatics. Thus, the Generation Challenge Program (GCP) was launched, addressing many of the topics and findings that were covered by our workshop. So our SGRP/ICWG-GR effort was not in vain. In fact, one of the workshop participants, Bob Zeigler, became the first director of the GCP. Bob had been head of one of IRRI’s research programs from 1992 until he left in about 1998 to become chair of the Department of Plant Pathology at Kansas State University. He returned to IRRI in 2004 as Director General!

Moving forward
The Genebanks CRP was superseded by the Genebank Platform at the beginning of the year. The genebanks have certainly benefited from the secure funding that, after many years of dithering, the CGIAR finally allocated. The additional and external support from the Crop Trust has been the essential element to enable the genebanks to move forward.

In terms of data management, Genesys has gone way beyond the SGRP’s SINGER data management system, and now includes data on almost 3,602,000 accessions held in 434 institutes. Recently, DOIs have been added to more than 180,000 of these accessions.

One of the gems of the Genebanks CRP, which continues in the Genebank Platform, is delivery and implementation of a Quality Management System (QMS), which has two overarching objectives. QMS defines the necessary activities to ensure that genebanks meet all policy and technical standards and outlines ways to achieve continual quality improvement in the genebank’s administrative, technical and operational performance. As a result, it allows genebank users, regulatory bodies and donors to recognize and confirm the competence, effectiveness and efficiency of Platform genebanks.

The QMS applies to all genebank operations, staff capacity and succession, infrastructure and work environments, equipment, information technology and data management, user satisfaction, risk management and operational policies.

The Platform has again drawn in the policy elements of germplasm conservation and use, as was the case under the SGRP (but ‘ignored’ under the Genebanks CRP), and equally importantly, the essential elements of germplasm health and exchange, to ensure the safe transfer of germplasm around the world.

Yes, I believe that as far as the CGIAR genebanks are concerned, genetic resources are in safe(r) hands. I cannot speak for genebanks elsewhere, although many are also maintained to a high standard. Unfortunately that’s not always the case, and I do sometimes wonder if there are simply too many genebanks or germplasm collections for their own good.

But that’s the stuff of another blog post once I’ve thought through all the implications of the various threads that are tangled in my mind right now.

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

¹ Research centers of the CGIAR (* genebank)

  • International Potato Center (CIP), Lima, Peru*
  • International Center for Tropical Agriculture (CIAT), Cali, Colombia*
  • International Maize and Wheat Improvement Center (CIMMYT), Texcoco, nr. Mexico DF, Mexico*
  • Bioversity International, Rome, Italy*
  • International Center for Agricultural Research in the Dry Areas (ICARDA), Lebanon and Morocco*
  • AfricaRice (WARDA), Bouaké / Abidjan, Côte d’Ivoire*
  • International Institute of Tropical Agriculture (IITA), Ibadan, Nigeria*
  • International Livestock Research Institute (ILRI), Addis Ababa, Ethiopia and Nairobi, Kenya*
  • World Agroforestry Centre (ICRAF), Nairobi, Kenya*
  • International Crops Research Institute for the Semi-Arid Tropics (ICRISAT), Hyderabad, India*
  • International Rice Research Institute (IRRI), Los Baños, Philippines*
  • Center for International Forestry Research (CIFOR), Bogor, Indonesia
  • WorldFish, Penang, Malaysia
  • International Water Management Institute (IWMI), Colombo, Sri Lanka
  • International Food Policy Research Institute (IFPRI), Washington, DC, USA

² The objectives of the International Treaty on Plant Genetic Resources for Food and Agriculture are the conservation and sustainable use of all plant genetic resources for food and agriculture and the fair and equitable sharing of the benefits arising out of their use, in harmony with the Convention on Biological Diversity, for sustainable agriculture and food security.

³ ILCA was merged in January 1995 with the International Laboratory for Research on Animal Diseases, based in Nairobi, Kenya, to form the International Livestock Research Institute (ILRI) with two campuses in Nairobi and Addis Ababa. The forages genebank is located at the Addis campus. A new genebank building was opened earlier this year.

When is white not white? When it’s green, of course.

Or maybe another color altogether. Then again, I could ask when tall is actually short, or a whole host of apparently contradictory questions.

What a conundrum.

No, this isn’t some fiction. It was the reality I faced when I took up the reins as head of IRRI’s Genetic Resources Center (GRC) in July 1991 and asked for a demonstration of the ‘genebank data management system’.

A large germplasm collection, or was it?
The International Rice Genebank (IRG) at IRRI holds the world’s largest and (almost certainly) most genetically diverse collection of rice: varieties of Asian rice (Oryza sativa) and African rice (O. glaberrima), and wild species of rice (not only Oryza species, but representatives from related genera).

Besides providing the very best conditions to ensure the long-term survival of these precious seed samples (as I blogged about recently), it’s also essential to document, curate, and easily retrieve information about the germplasm stored in the genebank. That’s quite a daunting prospect, especially for a collection as large as the International Rice Genebank Collection (IRGC), with over 126,600 samples or accessions at the last count¹. (During my tenure as head of GRC, the collection actually grew by about 25% or so, with funding for germplasm collecting from the Swiss government.)

I discovered that the three rice types—Asian, African and wild species—were being managed essentially as three separate germplasm collections, each with its own data management system. What a nightmare! It was almost impossible to get a quick answer to any simple question, such as ‘How many accessions are there in the genebank from Sri Lanka?’ It took three staff to query the databases, formulating their queries in slightly different ways because of the different database structures.

But why was it necessary to ask such questions, and require a rapid response? In 1993 the Convention on Biological Diversity (CBD) came into force. I had anticipated that IRRI would receive an increasing number of requests from different countries about the status and disposition of rice germplasm from each that was conserved in the IRG. Until we had an effective data management system we would have to continue trawling through decades of paperwork to find answers. And indeed there was an increase in such requests as countries became concerned that their germplasm might be misappropriated in some way or other. I should say that the changes we subsequently implemented stood IRRI in good stead when the International Treaty on Plant Genetic Resources for Food and Agriculture came into force, with its requirements to track all germplasm flows and use. But I’m getting ahead of myself.

It made no sense to me that the rice types should be managed as separate collections, since once in the same genebank vaults seeds were stored under identical conditions.  So, as I indicated elsewhere on this blog, I appointed Flora de Guzman as genebank manager with overall responsibility for the entire rice collection, and started to study various aspects of germplasm regeneration and seed conservation. Since the wild rices had a special nursery screenhouse for multiplication of seed stocks (a requirement of the Philippines Quarantine Service), another member of staff became curator of the wild species on a day-to-day basis.

The data management challenge
In 1991 the IRG had three very competent data management staff: Adel Alcantara, Vanji Guevarra, and Myrna Oliva, soon to be joined by a technical assistant, Nelia Resurreccion.

Due to the lack of oversight for data management, I realized the trio were each doing their own thing for the sativas, the glaberrimas, and the wild species, so to speak, with limited reference to what the others were doing. To make any significant improvements to data management, it would be necessary to build a single data system for all germplasm in the genebank. I thought this would be quite a straightforward undertaking, taking maybe a couple of months or so. How wrong I was! It was much more complex than I had, in my naivety, envisaged.

Back in 1991, PC technology was still in its infancy; well, maybe approaching juvenility. The databases were managed using ORACLE on a VAX mainframe. More nightmares! Fortunately, with some investment in office design and furniture (providing each staff member with a proper workstation and the ability to work better as a team) and in more powerful PCs, we were able to migrate the new data management systems to local servers. We left the VAX behind, but unfortunately still had an ORACLE legacy that was far more difficult to ditch. I also wanted to develop an online data management system that would permit researchers at IRRI, and eventually around the world, to access germplasm data for themselves rather than always having to request information from genebank staff. That was the less than ideal situation when I joined IRRI. In fact, in order to access genebank data then, it was necessary to make a request in writing that was approved by the head of the genebank, then Dr TT Chang. I put a stop to that right away. Because the data had been accumulated using public funds, they should be made freely available henceforth to anyone. Direct and unhindered access to genebank data was my goal.

The underlying problem
However, the three databases could not ‘talk’ to one another, because their structures and data were different for the three ‘collections’. Let me explain.

There are basically two types of germplasm data, what we call passport data, and characterization and evaluation data. The passport data include such pieces of information as the identity of germplasm (often referred to as the accession number), the donor number and the collector’s number, for example. These data are, or should be, unique to a piece of germplasm or an accession. But passport data also include information about the date of acquisition, when it was first stored in the genebank, who has requested a seed sample, and when. Of course there’s a great deal more, but these examples suffice to explain something of the nature of these data.

Characterization (qualitative) and evaluation (mainly quantitative) data describe various aspects (or traits as they are known) of rice plants such as leaf and grain color, or plant height, days to flowering, and resistance or tolerance to pests and diseases, using agreed sets of descriptors and scoring codes or actual measurements. The International Board for Plant Genetic Resources (IBPGR, which became the International Plant Genetic Resources Institute, then Bioversity International) had developed these crop descriptors, and the first—for rice—was published jointly with IRRI in 1980 (and revised and updated in 2007).

An essential condition for a successful data management system therefore is that information is recorded and stored consistently. In order for the three databases to talk to each other, we had to correct any differences in database structure, such as the naming and structure of database fields, as well as consistent use of codes, units, etc. for the actual information. This is what we discovered.

Take the most basic (and one of the most important) database field for accession number, for example. In one database, this field was named ‘ACC_NO’, in another ‘ACCNO’. And the structure was different as well. For the sativas it was a five digit numeric field; for the glaberrimas, a six digit numeric field; and for the wild species, a seven digit alphanumeric field. No wonder the databases couldn’t talk to each other at the most basic level.

But why were there three structures? (The field name, incidentally, was easily resolved.) When the collection was first established, the accession numbers from ‘00001’ to ‘99999’ were reserved for the O. sativa accessions. Then the numbers from ‘100000’ and above were assigned to O. glaberrima and the wild species. However, thirteen wild species samples were found to be mixtures of two species. So they were divided and each given a suffix ‘A’ or ‘B’, such as ‘100569A’ and ‘100569B’ (not actual numbers, just illustrative). That meant that the wild species now had a seven digit alphanumeric field. Why one component of each mixture wasn’t just assigned a new six digit number, as we did, I’ll never understand. Then we had to convert the O. sativa accession number into a six digit numeric field (‘000001’ etc.) and, with a consistent field name across databases (‘ACCNO’ perhaps), we could then link databases for the first time. In 1991, there was a gap between the sativa numbers (perhaps between ‘80000’ and ‘99999’) before the other accessions started at ‘100000’. Irrespective of rice type, we just inserted consecutive numbers as we received new samples, until there were no gaps at all in the sequence.
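To make the accession number conversion concrete, here is a minimal sketch in Python. It is an illustration only, not the tooling used at the time (the databases were in ORACLE). The function names `normalize_accno` and `harmonize_record` and the sample records are hypothetical, and the sketch assumes that suffixed numbers such as ‘100569A’ are set aside for a manual decision rather than converted automatically.

```python
import re

ACCNO_WIDTH = 6  # target: one six-digit numeric field for every rice type


def normalize_accno(raw):
    """Return a zero-padded six-digit accession number as a string.

    Raises ValueError for anything that needs a manual decision, such as
    suffixed numbers like '100569A' that came from split mixtures.
    """
    value = str(raw).strip()
    if not re.fullmatch(r"\d{1,6}", value):
        raise ValueError(f"needs manual review: {value!r}")
    return value.zfill(ACCNO_WIDTH)


def harmonize_record(record):
    """Return a copy of the record with a single, normalized ACCNO field.

    Accepts either of the two field names found in the old databases
    ('ACC_NO' or 'ACCNO').
    """
    out = dict(record)
    raw = out.pop("ACC_NO", None)
    if raw is None:
        raw = out.get("ACCNO")
    out["ACCNO"] = normalize_accno(raw)
    return out
```

With something like this in place, a five digit sativa number such as ‘12345’ becomes ‘012345’, a six digit number passes through unchanged, and the handful of suffixed wild species numbers are flagged so that a new six digit number can be assigned by hand.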

White is white, yeah?
Now imagine achieving consistency right across the databases for all fields. We found that a character was often recorded/coded in different ways between rice types. So in one, the color ‘white’ might have been coded as a ‘1’, but as a ‘5’ in another. Or ‘1’ was ‘green’ in another database. And so it went on. We had to convert all codes to a meaningful and consistent description, each independent of the other. So ‘1’ was converted in one database to ‘white’ and ‘5’ to ‘white’ as well, etc. Having made all these conversions, with very careful cross checking along the way, and regular data back-ups, we finally had consistent field names and structures, and recording/coding of data for the entire germplasm collection. I don’t remember exactly how long this took, but it must have been between 18 months and two years.
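The code conversion can be sketched in the same way. Again this is a hypothetical illustration in Python, not what was actually run: the code tables below are invented (only the ‘1’ and ‘5’ for ‘white’ and the ‘1’ for ‘green’ echo the example above), and the field name `COLOR` and both function names are assumptions.

```python
# Hypothetical per-database code tables: the same code could mean
# different things depending on which collection recorded it.
CODE_TABLES = {
    "sativa": {"1": "white", "2": "green"},
    "glaberrima": {"5": "white", "1": "green"},
}


def decode_color(database, code):
    """Translate a database-specific code into a plain description."""
    table = CODE_TABLES[database]
    try:
        return table[str(code)]
    except KeyError:
        # Unknown codes are not guessed at; they are reported for checking.
        raise ValueError(f"unknown code {code!r} in {database}") from None


def decode_records(database, records, field="COLOR"):
    """Return copies of the records with the coded field spelled out."""
    return [{**rec, field: decode_color(database, rec[field])} for rec in records]
```

Converting every code to its description first, database by database, means the merged data no longer depend on which collection a record came from. Raising an error on an unknown code, rather than guessing, reflects the careful cross checking the work required.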

The next step
But once that was completed, we could move on to the next phase of developing an online system to access genebank data, the International Rice Genebank Collection Information System (IRGCIS), with inputs from the former System-wide Genetic Resources Program (SGRP), an initiative of all the CGIAR centers with genebanks and genetic resources activities.

IRGCIS is a comprehensive system that manages the data of all rice germplasm conserved at IRRI.  It is designed to manage the genebank operations more efficiently. It links all operations associated with germplasm conservation and management from acquisition of samples through seed multiplication, conservation, characterization, rejuvenation and distribution to end-users.

The system aims to:

  • Assist the genebank staff in day-to-day activities.
  • Facilitate recording, storage and maintenance of germplasm data.
  • Allow the request of desired seeds and provide direct access to information about accessions in the genebank.

The data that are accessible are:

  • Passport data.
  • Morpho-agronomic descriptions.
  • Evaluation data on the International Rice Genebank Collection.
  • Germplasm availability.

A couple of years after IRGCIS, work began to develop the International Rice Information System (IRIS) as part of the International Crop Information System (ICIS) for the management of improved germplasm, breeding lines and the like, with full genealogy data. INGER also developed the INGERIS, but to tell the truth I’m not sure exactly where IRRI is these days with regard to cross system integration and the like.

But as I mentioned earlier, of one thing I am certain. Had we not taken the fundamental steps to clean up our data management act almost 25 years ago, we would not have had an effective platform to respond to global germplasm initiatives like the International Treaty or CBD, nor take advantage relatively easily of new data management software and hardware. It did require that broad perspective in the first instance. That I could bring to the party even though I didn’t have the technical know-how to undertake the detailed work myself.

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

¹ Source: the International Rice Genebank Collection Information System (IRGCIS), 8 June 2015.