Spectral Karyotyping (SKY) (1–7) and Comparative Genomic Hybridization (CGH) (8–11) are complementary fluorescent molecular cytogenetic techniques that have revolutionized the detection of chromosomal abnormalities. SKY permits the simultaneous visualization of all human or mouse chromosomes, each in a different color, facilitating the detection of chromosomal translocations and rearrangements (Figure 1). CGH uses the hybridization of differentially labeled tumor and reference DNA to generate a map of DNA copy number changes in tumor genomes.
Figure 1. A metaphase spread after SKY hybridization. The RGB image demonstrates cytogenetic abnormalities in a cell from a secondary leukemia cell line.
The goal of the SKY/CGH database is to allow investigators to submit and analyze both clinical and research (e.g., cell line) SKY and CGH data. The database is growing and currently holds about 700 datasets, some of which are being held private until published. Several hundred labs around the world use these techniques, and many more examine the data they generate. Submitters can enter data from their own cases in either of two formats, public or private. Public data are generally those that have already been published, whereas private data can be viewed only by the submitters, who can transfer them to the public format at their discretion. The results are stored under the name of the submitter and are listed according to case number. The homepage [http://www.ncbi.nlm.nih.gov/sky/] includes a basic description of the SKY and CGH techniques and provides links to a more detailed explanation and relevant literature.
Detailed information on how to submit data either to the SKY or CGH sectors of the database can be found through links on the homepage [http://www.ncbi.nlm.nih.gov/sky/]. What follows is a brief outline.
The submitter enters the written karyotype, the number of normal and abnormal copies for each chromosome, and the number of cells for each clone. Each abnormal chromosome segment is then described by typing in the beginning and ending bands, starting from the top of the chromosome. The computer then builds a colored ideogram of this chromosome and eventually a full karyotype (SKYGRAM), with each normal and abnormal chromosome displayed in its unique SKY classification color, with band overlay. Each breakpoint submitted is automatically linked by a button marked FISH to the human Map Viewer, which provides a list of genes at that site and available FISH clones for that breakpoint.
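The segment-by-segment description above can be pictured as a small data structure. The following Python sketch is illustrative only: the class names `Segment` and `AbnormalChromosome`, the `describe` output format, and the use of `pter`/`qter` for chromosome ends are assumptions made for this example and are not taken from the database's actual schema.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Segment:
    """One contiguous piece of an abnormal chromosome (hypothetical model)."""
    chromosome: str  # chromosome the segment comes from, e.g. "9"
    start_band: str  # band at the top of the segment, e.g. "pter"
    end_band: str    # band at the bottom of the segment, e.g. "q34"


@dataclass
class AbnormalChromosome:
    """Segments listed from the top of the chromosome downward."""
    segments: List[Segment]

    def breakpoints(self) -> List[str]:
        # A breakpoint lies on each side of every junction between segments.
        points = []
        for upper, lower in zip(self.segments, self.segments[1:]):
            points.append(f"{upper.chromosome}{upper.end_band}")
            points.append(f"{lower.chromosome}{lower.start_band}")
        return points

    def describe(self) -> str:
        # Plain-text stand-in for the colored ideogram.
        return " :: ".join(
            f"{s.chromosome}{s.start_band}->{s.end_band}" for s in self.segments
        )


# Example: a derivative chromosome 9 carrying material from chromosome 22.
der9 = AbnormalChromosome(
    [Segment("9", "pter", "q34"), Segment("22", "q11", "qter")]
)
print(der9.describe())
print(der9.breakpoints())
```

In this sketch the breakpoints are derived from the segment boundaries rather than entered separately, which mirrors the idea that each submitted breakpoint can be linked onward to a map position.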
The CGH database displays gains, losses, and amplification of chromosomes and chromosome segments. The data are entered either by hand or automatically from various CGH software programs. In the manual format, the submitter enters the band information for each affected chromosome, describing the start band and stop band for each gain or loss, and the computer program displays the final karyotype with vertical bars to the left or right of each chromosome, indicating loss or gain, respectively.
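As a rough illustration of the manual CGH format, the Python sketch below records each change as a start band and stop band and assigns it to the left (loss) or right (gain) side of the chromosome, as described above. The names `CghChange`, `bar_side`, and `layout` are hypothetical. Because the text does not say how amplifications are drawn, the sketch handles only gains and losses.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class CghChange:
    """One copy number change on a chromosome (hypothetical model)."""
    chromosome: str
    start_band: str
    stop_band: str
    kind: str  # "gain" or "loss"


def bar_side(change: CghChange) -> str:
    """Losses are drawn to the left of the chromosome, gains to the right."""
    if change.kind == "loss":
        return "left"
    if change.kind == "gain":
        return "right"
    # Other kinds (e.g. amplification) are outside the scope of this sketch.
    raise ValueError(f"unsupported change kind: {change.kind!r}")


def layout(changes: List[CghChange]) -> Dict[str, Dict[str, List[str]]]:
    """Group the changes per chromosome into left and right bars."""
    result: Dict[str, Dict[str, List[str]]] = {}
    for change in changes:
        sides = result.setdefault(change.chromosome, {"left": [], "right": []})
        sides[bar_side(change)].append(f"{change.start_band}-{change.stop_band}")
    return result


changes = [
    CghChange("8", "q21", "q24", "gain"),
    CghChange("17", "p13", "p11", "loss"),
]
print(layout(changes))
```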
Clinical data submitted include case identification, World Health Organization (WHO) disease classification code, diagnosis, organ, tumor type, and disease stage. To obtain the correct classification code, a link is provided to the NCI's Metathesaurus™ site [http://ncievs.nci.nih.gov/NCIMetaphrase.html], which includes, among its many systems, the codes developed by the WHO and NCI, and published as the International Classification of Diseases. The references for the published cases are entered into the Case Information page and are linked to their abstracts in PubMed.
A colored karyotype with band overlay is presented to the submitter, who then builds each aberrant chromosome by cutting and pasting (by clicking with the mouse at appropriate breakpoints). Each aberrant chromosome is then inserted into the full karyotype, which is displayed as a SKYGRAM.
To speed up the entry of cytogenetic data into the database, NCBI has built a computer program to automatically read short-form karyotypes, extract the information therein, and insert it into the SKY database (Figure 7). Karyotypes are written according to specific rules described in An International System for Human Cytogenetic Nomenclature (1995) (12). Using these rules, the parser (1) breaks the karyotype into small syntactic components, (2) assembles information from these components into an information structure in computer memory, (3) transforms this information into the formats required for an application, and (4) uses the information in the application, i.e., inserts it into the database. To accomplish this, the syntactic parser first extracts the information out of each piece of the input; the pieces are then put directly into a tree structure that represents karyotype semantics. For insertion into the SKY database, the karyotype information is transformed into ASN.1 structures that reflect the design of the database.
All chromosomal bands, including breakpoints involved in chromosomal abnormalities, are linked to the Map Viewer database (Figure 4; see also Chapter 20). The Map Viewer [http://www.ncbi.nlm.nih.gov/mapview] provides graphical displays of features on NCBI's assembly of human genomic sequence data as well as cytogenetic, genetic, physical, and radiation hybrid maps. Map features that can be seen along the sequence include NCBI contigs, the BAC tiling path, and the location of genes, STSs, FISH mapped clones, ESTs, GenomeScan models, and variation (SNPs; see Chapter 5).
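The first two stages of the karyotype parser (breaking the input into syntactic components and assembling them into an in-memory structure) can be sketched in Python as follows. This is not NCBI's parser: it is a minimal illustration that assumes a very small subset of the short-form notation (comma-separated fields, aberrations of the form `kind(chromosomes)(bands)`, and an optional trailing cell count in square brackets). All names here (`Aberration`, `Karyotype`, `tokenize`, `parse`) are hypothetical, and the transformation to ASN.1 and the database insertion are not covered.

```python
import re
from dataclasses import dataclass, field
from typing import List, Optional, Tuple

# Assumed shape of a structural aberration, e.g. "t(9;22)(q34;q11)".
ABERRATION = re.compile(
    r"^(?P<kind>[a-z]+)\((?P<chroms>[^)]+)\)(?:\((?P<bands>[^)]+)\))?$"
)
# Assumed trailing cell count, e.g. "[20]".
CELL_COUNT = re.compile(r"\[(\d+)\]$")


@dataclass
class Aberration:
    kind: str               # e.g. "t" for translocation
    chromosomes: List[str]  # e.g. ["9", "22"]
    breakpoints: List[str]  # e.g. ["q34", "q11"]


@dataclass
class Karyotype:
    chromosome_count: str
    sex_chromosomes: str
    aberrations: List[Aberration] = field(default_factory=list)
    other: List[str] = field(default_factory=list)  # fields not recognized here
    cells: Optional[int] = None


def tokenize(text: str) -> Tuple[List[str], Optional[int]]:
    """Stage 1: split the karyotype into small syntactic components."""
    text = text.strip()
    cells = None
    match = CELL_COUNT.search(text)
    if match:
        cells = int(match.group(1))
        text = text[: match.start()]
    return [token.strip() for token in text.split(",")], cells


def parse(text: str) -> Karyotype:
    """Stage 2: assemble the components into an in-memory structure."""
    tokens, cells = tokenize(text)
    if len(tokens) < 2:
        raise ValueError("expected at least a chromosome count and sex chromosomes")
    karyotype = Karyotype(tokens[0], tokens[1], cells=cells)
    for token in tokens[2:]:
        match = ABERRATION.match(token)
        if match:
            bands = match.group("bands")
            karyotype.aberrations.append(
                Aberration(
                    kind=match.group("kind"),
                    chromosomes=match.group("chroms").split(";"),
                    breakpoints=bands.split(";") if bands else [],
                )
            )
        else:
            karyotype.other.append(token)  # e.g. numerical changes such as "+8"
    return karyotype


example = parse("46,XX,t(9;22)(q34;q11),+8[20]")
print(example)
```

A real parser would need to cover the full nomenclature, including multiple clones and nested or long-form descriptions. The sketch keeps unrecognized fields in `other` rather than failing, so that no information from the input is silently dropped.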
Links are provided to related Web sites, including chromosome databases (e.g., the Mitelman database); other NCI (e.g., CGAP and CCAP) and NCBI [e.g., the Map Viewer (Chapter 20), LocusLink (Chapter 19), and PubMed (Chapter 2)] resources; The Jackson Laboratory; and several other CGH sites.