MIQAS—Minimum Information for QTLs and Association Studies

What is MIQAS?

The issue

Genetics researchers have long used QTL and association studies to pinpoint the parts of the genome that underlie disease or production phenotypes. A wealth of papers describes these results, which creates the opportunity to fine-map QTL regions. If several papers describe a QTL for the same trait and those QTLs overlap, the region they share is the most likely to contain the gene of interest.
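The overlap argument can be sketched in a few lines of Python. This is only an illustration: the QtlRegion class, the common_overlap function, the map name and all positions are made up for this example and are not part of MIQAS. The sketch assumes that all regions are expressed in cM on the same linkage map and chromosome, because positions are not comparable otherwise.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class QtlRegion:
    """A reported QTL interval (hypothetical structure, positions in cM)."""
    map_name: str
    chromosome: str
    start_cm: float
    end_cm: float


def common_overlap(regions: Sequence[QtlRegion]) -> Optional[QtlRegion]:
    """Return the interval shared by all regions, or None if they do not all overlap."""
    if not regions:
        return None
    first = regions[0]
    # cM positions can only be compared on the same map and chromosome.
    if any((r.map_name, r.chromosome) != (first.map_name, first.chromosome)
           for r in regions):
        raise ValueError("regions must refer to the same linkage map and chromosome")
    start = max(r.start_cm for r in regions)
    end = min(r.end_cm for r in regions)
    if start > end:
        return None
    return QtlRegion(first.map_name, first.chromosome, start, end)


# Three made-up reports of a QTL for the same trait.
reported = [
    QtlRegion("example_map", "4", 30.0, 62.0),
    QtlRegion("example_map", "4", 45.0, 80.0),
    QtlRegion("example_map", "4", 40.0, 58.0),
]
shared = common_overlap(reported)
print(shared)
```

Note that the function refuses to compare regions from different maps. That is exactly the information that is often missing from published papers.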

Unfortunately, even though much data is ‘out there’, it cannot be fully used. The current practice for QTL and association studies is presented in the figure below: wet-lab research (genotyping and phenotyping) is followed by a statistical analysis to identify QTLs or associated markers, and the results are published in a paper. Even though the paper itself can be valuable, the data and results embedded in it can be hard to extract. In addition, curators of QTL and association databases have to mine the literature to identify the papers that hold relevant information.

On top of the issues with this practice, the absence of minimum standards leads to highly inconsistent reporting. Some of the issues include:

Often a QTL region is described by a cM range without mentioning which linkage map the cM positions refer to. This makes the results useless for subsequent analysis.
The criterion used to define a QTL region is almost never mentioned in a paper (e.g. “region where the LOD score is higher than 3”).
It is often very difficult to transfer information to other (linkage and radiation hybrid) maps because many markers can have the same name, and a single marker can have many names. This problem of marker identity is a direct result of having to rely solely on marker names rather than marker accession numbers.
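As an illustration of the three issues listed above, the following Python sketch checks a reported QTL for the missing pieces. The field names (linkage_map, region_criterion, marker_accessions) and the example report are assumptions made up for this sketch. They are not taken from the MIQAS specification.

```python
from typing import Any, Dict, List

# Hypothetical field names mapped to a description of what they should hold.
REQUIRED_FIELDS = {
    "linkage_map": "linkage map that the cM positions refer to",
    "region_criterion": "criterion used to define the QTL region",
    "marker_accessions": "accession numbers of the markers",
}


def missing_information(report: Dict[str, Any]) -> List[str]:
    """Describe every required field that is absent or empty in the report."""
    return [description
            for field, description in REQUIRED_FIELDS.items()
            if not report.get(field)]


# A made-up report in the style of many papers: a cM range and marker names only.
report = {
    "trait": "body weight",
    "start_cm": 30.0,
    "end_cm": 62.0,
    "marker_names": ["MARKER_A", "MARKER_B"],
    "region_criterion": "LOD score higher than 3",
}
for problem in missing_information(report):
    print("missing:", problem)
```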

A solution

The MIQAS set of rules, together with the standardized XML and tab-delimited file formats, will serve two goals:

to encourage research groups that wish to publish a QTL paper to submit the information that makes meta-analysis possible.
to allow easy interchange of data between different QTL and association analysis databases. Databases that implement the standardized XML format will typically write an import filter and an export filter to read data from, and dump data into, an XML file. This is the same approach as used for the exchange of sequences between NCBI, Ensembl and DDBJ at the early stages of the Human Genome Project (see the description of the CAF format in Lincoln Stein’s How Perl Saved the Human Genome Project).
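A minimal sketch of such an import and export filter pair, written in Python with the standard xml.etree.ElementTree module, could look as follows. The element and attribute names (study, trait, qtl, map, start_cm, ...) are placeholders invented for this example. The real names are defined by the MIQAS_XML schema, which is not reproduced here.

```python
import xml.etree.ElementTree as ET
from typing import Any, Dict


def export_study(study: Dict[str, Any]) -> str:
    """Export filter: dump a study from the database's own structure to XML."""
    root = ET.Element("study", name=str(study["name"]))
    ET.SubElement(root, "trait").text = str(study["trait"])
    for qtl in study["qtls"]:
        # Each QTL becomes one element; its fields become XML attributes.
        ET.SubElement(root, "qtl", {key: str(value) for key, value in qtl.items()})
    return ET.tostring(root, encoding="unicode")


def import_study(xml_text: str) -> Dict[str, Any]:
    """Import filter: read the XML back into the database's own structure."""
    root = ET.fromstring(xml_text)
    return {
        "name": root.get("name"),
        "trait": root.findtext("trait"),
        "qtls": [dict(qtl.attrib) for qtl in root.findall("qtl")],
    }


# Made-up study; values are kept as strings so the round trip is exact.
study = {
    "name": "example_study",
    "trait": "body weight",
    "qtls": [{"map": "example_map", "chromosome": "4",
              "start_cm": "45.0", "end_cm": "58.0"}],
}
xml_text = export_study(study)
print(xml_text)
round_trip = import_study(xml_text)
```

Because both databases only need to agree on the XML file, each one can keep its own internal schema and still exchange studies with the other.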

What should the workflow for the future look like?

Several research communities have already taken a similar step. Good examples are the publication of results for microarray and SNP discovery efforts. If a researcher wants to publish a paper that describes the discovery of new polymorphic markers (e.g. SNPs), it is often no longer possible to get away with listing the PCR primers in a long table within the manuscript. More and more, journals request that those markers first be submitted to a central database such as dbSNP and that only the accession numbers of the markers be mentioned in the paper. Even though this involves a little added work for the scientist, the value of having the markers available through dbSNP outweighs this disadvantage by far. The microarray community uses the same approach with the MIAME standard.

Using MIQAS, the QTL and association research community can start to employ a similar paradigm for reporting their results. Before submitting the manuscript to a journal, the research group first submits the data and results to a MIQAS-compliant database. This database creates an accession number for the study that can be referenced in the manuscript.

Parties involved

To make the workflow described above (similar to the one for microarray results) work, several parties have to play a role in the process:

QTL software can make it easy for researchers to get their data and results into a MIQAS file format. This can be achieved by creating output filters (“File -> Export to MIQAS”) that generate a MIQAS_XML file for a particular study, holding information on all aspects of the experiment.
MIQAS-compliant databases need to (a) be able to hold all information specified in the MIQAS standard, (b) be able to load MIQAS-formatted data using an import filter and (c) be able to export QTL or association studies into MIQAS_XML format.
Journals need to encourage researchers to submit their data and results to a MIQAS-compliant database before submitting the manuscript.

Specification

Overview

The central entity described in a MIQAS_XML file is a QTL or association study. This means that if a study results in more than one QTL, all of these can be described together in one MIQAS file.

Apart from general information about the QTL/association analysis like a name, description and reference, most of the information refers to (1) the trait that was measured, (2) the experiment that was used and (3) the associated markers or positions of any QTLs.

1. MIQAS_TAB

The MIQAS_TAB specification outlines the format for a group of tab-delimited files to support submission and exchange of MIQAS-related data. This is your primary source of information about what aspects of a QTL or association study have to be recorded.

2. MIQAS_XML

An XML Schema called MIQAS_XML (available here) defines the format for data exchange between MIQAS-compliant databases. The same file format is to be used to submit data to a MIQAS-compliant database. For an example of such a file, see here. As there is a one-to-one mapping with the MIQAS_TAB specification, please see that specification for more information.

Getting your data in MIQAS format

There are two basic ways for researchers to get their data and results into a MIQAS_XML file:

Supporting QTL software

If you’re lucky, you’ve been using one of the software suites that has MIQAS support built in and allows you to export all data and results to MIQAS_XML. Software packages that are incorporating this functionality (still a work in progress):

Using MIQAS_TAB

You can also create a group of tab-delimited files according to the MIQAS_TAB specification. A website that can convert these files into a single MIQAS_XML file will be created, but is not available yet.
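To give an idea of what such a conversion involves, here is a rough Python sketch that turns one tab-delimited table into XML using only the standard library. It is not the planned converter. The function name tab_to_xml, the column names and the element names are assumptions for illustration, and the real layout is defined by the MIQAS_TAB and MIQAS_XML specifications. The sketch also assumes that every column header is a valid XML element name.

```python
import csv
import io
import xml.etree.ElementTree as ET


def tab_to_xml(tab_text: str, root_tag: str = "qtls", row_tag: str = "qtl") -> str:
    """Convert a tab-delimited table with a header row into a flat XML document."""
    reader = csv.DictReader(io.StringIO(tab_text), delimiter="\t")
    root = ET.Element(root_tag)
    for row in reader:
        element = ET.SubElement(root, row_tag)
        # One child element per column, named after the column header.
        for column, value in row.items():
            ET.SubElement(element, column).text = value
    return ET.tostring(root, encoding="unicode")


# Made-up table: a header line followed by one QTL.
example = "chromosome\tstart_cm\tend_cm\n4\t45.0\t58.0\n"
xml_text = tab_to_xml(example)
print(xml_text)
```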

Presentations

PAG 2007 in San Diego.
Minimalistic presentation given at the Institute

Contributors

In alphabetical order:

Jan Aerts (KU Leuven, formerly at Wellcome Trust Sanger Institute and Roslin Institute)
Dave Burt (Roslin Institute)
Wilfrid Carre (Roslin Institute)
DJ DeKoning (Roslin Institute)
Chris Haley (Roslin Institute)
Zhiliang Hu (Iowa State University)
Andy Law (Roslin Institute)
Jim Reecy (Iowa State University)
...and many others through their discussions

Contact

Please send inquiries to jan.aerts@kuleuven.be or jreecy@iastate.edu