Path | Short | Definition | Comments |
---|---|---|---|
MolecularSequence | Information about a biological sequence | Raw data describing a biological sequence. | |
identifier | Unique ID for this particular sequence. This is a FHIR-defined id | A unique identifier for this particular sequence instance. This is a FHIR-defined id. | |
type | aa \| dna \| rna | Amino Acid Sequence / DNA Sequence / RNA Sequence. | |
coordinateSystem | Base number of coordinate system (0 for 0-based numbering or coordinates, inclusive start, exclusive end; 1 for 1-based numbering, inclusive start, inclusive end) | Whether the sequence is numbered starting at 0 (0-based numbering or coordinates, inclusive start, exclusive end) or starting at 1 (1-based numbering, inclusive start and inclusive end). | |
patient | Who and/or what this is about | The patient whose sequencing results are described by this resource. | |
specimen | Specimen used for sequencing | Specimen used for sequencing. | |
device | The method for sequencing | The method for sequencing, for example, chip information. | |
performer | Who should be responsible for test result | The organization or lab that should be responsible for this result. | |
quantity | The number of copies of the sequence of interest. (RNASeq) | The number of copies of the sequence of interest. (RNASeq). | |
referenceSeq | A sequence used as reference | A sequence that is used as a reference to describe variants that are present in a sequence analyzed. | |
referenceSeq.id | Unique id for inter-element referencing | Unique id for the element within a resource (for internal references). This may be any string value that does not contain spaces. | |
referenceSeq.extension | Additional content defined by implementations | May be used to represent additional information that is not part of the basic definition of the element. To make the use of extensions safe and manageable, there is a strict set of governance applied to the definition and use of extensions. Though any implementer can define an extension, there is a set of requirements that SHALL be met as part of the definition of the extension. | There can be no stigma associated with the use of extensions by any application, project, or standard - regardless of the institution or jurisdiction that uses or defines the extensions. The use of extensions is what allows the FHIR specification to retain a core level of simplicity for everyone. |
referenceSeq.modifierExtension | Extensions that cannot be ignored even if unrecognized | May be used to represent additional information that is not part of the basic definition of the element and that modifies the understanding of the element in which it is contained and/or the understanding of the containing element's descendants. Usually modifier elements provide negation or qualification. To make the use of extensions safe and manageable, there is a strict set of governance applied to the definition and use of extensions. Though any implementer can define an extension, there is a set of requirements that SHALL be met as part of the definition of the extension. Applications processing a resource are required to check for modifier extensions. Modifier extensions SHALL NOT change the meaning of any elements on Resource or DomainResource (including cannot change the meaning of modifierExtension itself). | There can be no stigma associated with the use of extensions by any application, project, or standard - regardless of the institution or jurisdiction that uses or defines the extensions. The use of extensions is what allows the FHIR specification to retain a core level of simplicity for everyone. |
referenceSeq.chromosome | Chromosome containing genetic finding | Structural unit composed of a nucleic acid molecule which controls its own replication through the interaction of specific proteins at one or more origins of replication ([SO:0000340](http://www.sequenceontology.org/browser/current_svn/term/SO:0000340)). | |
referenceSeq.genomeBuild | The Genome Build used for reference, following GRCh build versions e.g. 'GRCh 37' | The Genome Build used for reference, following GRCh build versions e.g. 'GRCh 37'. Version number must be included if a versioned release of a primary build was used. | |
referenceSeq.orientation | sense \| antisense | A relative reference to a DNA strand based on gene orientation. The strand that contains the open reading frame of the gene is the "sense" strand, and the opposite complementary strand is the "antisense" strand. | |
referenceSeq.referenceSeqId | Reference identifier | Reference identifier of reference sequence submitted to NCBI. It must match the type in the MolecularSequence.type field. For example, the prefix, “NG_” identifies reference sequence for genes, “NM_” for messenger RNA transcripts, and “NP_” for amino acid sequences. | |
referenceSeq.referenceSeqPointer | A pointer to another MolecularSequence entity as reference sequence | A pointer to another MolecularSequence entity as reference sequence. | |
referenceSeq.referenceSeqString | A string to represent reference sequence | A string like "ACGT". | |
referenceSeq.strand | watson \| crick | An absolute reference to a strand. The Watson strand is the strand whose 5'-end is on the short arm of the chromosome, and the Crick strand is the one whose 5'-end is on the long arm. | |
referenceSeq.windowStart | Start position of the window on the reference sequence | Start position of the window on the reference sequence. If the coordinate system is either 0-based or 1-based, then start position is inclusive. | |
referenceSeq.windowEnd | End position of the window on the reference sequence | End position of the window on the reference sequence. If the coordinate system is 0-based then end is exclusive and does not include the last position. If the coordinate system is 1-based, then end is inclusive and includes the last position. | |
variant | Variant in sequence | The definition of variant here originates from the Sequence Ontology ([variant_of](http://www.sequenceontology.org/browser/current_svn/term/variant_of)). This element can represent an amino acid or nucleic acid sequence change (including insertion, deletion, SNP, etc.). It can represent some complex mutations or segment variations with the help of a CIGAR string. | |
variant.id | Unique id for inter-element referencing | Unique id for the element within a resource (for internal references). This may be any string value that does not contain spaces. | |
variant.extension | Additional content defined by implementations | May be used to represent additional information that is not part of the basic definition of the element. To make the use of extensions safe and manageable, there is a strict set of governance applied to the definition and use of extensions. Though any implementer can define an extension, there is a set of requirements that SHALL be met as part of the definition of the extension. | There can be no stigma associated with the use of extensions by any application, project, or standard - regardless of the institution or jurisdiction that uses or defines the extensions. The use of extensions is what allows the FHIR specification to retain a core level of simplicity for everyone. |
variant.modifierExtension | Extensions that cannot be ignored even if unrecognized | May be used to represent additional information that is not part of the basic definition of the element and that modifies the understanding of the element in which it is contained and/or the understanding of the containing element's descendants. Usually modifier elements provide negation or qualification. To make the use of extensions safe and manageable, there is a strict set of governance applied to the definition and use of extensions. Though any implementer can define an extension, there is a set of requirements that SHALL be met as part of the definition of the extension. Applications processing a resource are required to check for modifier extensions. Modifier extensions SHALL NOT change the meaning of any elements on Resource or DomainResource (including cannot change the meaning of modifierExtension itself). | There can be no stigma associated with the use of extensions by any application, project, or standard - regardless of the institution or jurisdiction that uses or defines the extensions. The use of extensions is what allows the FHIR specification to retain a core level of simplicity for everyone. |
variant.start | Start position of the variant on the reference sequence | Start position of the variant on the reference sequence. If the coordinate system is either 0-based or 1-based, then start position is inclusive. | |
variant.end | End position of the variant on the reference sequence | End position of the variant on the reference sequence. If the coordinate system is 0-based then end is exclusive and does not include the last position. If the coordinate system is 1-based, then end is inclusive and includes the last position. | |
variant.observedAllele | Allele that was observed | An allele is one of a set of coexisting sequence variants of a gene ([SO:0001023](http://www.sequenceontology.org/browser/current_svn/term/SO:0001023)). Nucleotide(s)/amino acids from start position of sequence to stop position of sequence on the positive (+) strand of the observed sequence. When the sequence type is DNA, it should be the sequence on the positive (+) strand. This will lie in the range between variant.start and variant.end. | |
variant.referenceAllele | Allele in the reference sequence | An allele is one of a set of coexisting sequence variants of a gene ([SO:0001023](http://www.sequenceontology.org/browser/current_svn/term/SO:0001023)). Nucleotide(s)/amino acids from start position of sequence to stop position of sequence on the positive (+) strand of the reference sequence. When the sequence type is DNA, it should be the sequence on the positive (+) strand. This will lie in the range between variant.start and variant.end. | |
variant.cigar | Extended CIGAR string for aligning the sequence with reference bases | Extended CIGAR string for aligning the sequence with reference bases. See detailed documentation [here](http://support.illumina.com/help/SequencingAnalysisWorkflow/Content/Vault/Informatics/Sequencing_Analysis/CASAVA/swSEQ_mCA_ExtendedCIGARFormat.htm). | |
variant.variantPointer | Pointer to observed variant information | A pointer to an Observation containing variant information. | |
observedSeq | Sequence that was observed | Sequence that was observed. It is the sequence described by referenceSeq together with the variant records on referenceSeq. It shall start at referenceSeq.windowStart and end at referenceSeq.windowEnd. | |
quality | A set of values describing the quality of the sequence | An experimental feature attribute that defines the quality of the feature in a quantitative way, such as a phred quality score ([SO:0001686](http://www.sequenceontology.org/browser/current_svn/term/SO:0001686)). | |
quality.id | Unique id for inter-element referencing | Unique id for the element within a resource (for internal references). This may be any string value that does not contain spaces. | |
quality.extension | Additional content defined by implementations | May be used to represent additional information that is not part of the basic definition of the element. To make the use of extensions safe and manageable, there is a strict set of governance applied to the definition and use of extensions. Though any implementer can define an extension, there is a set of requirements that SHALL be met as part of the definition of the extension. | There can be no stigma associated with the use of extensions by any application, project, or standard - regardless of the institution or jurisdiction that uses or defines the extensions. The use of extensions is what allows the FHIR specification to retain a core level of simplicity for everyone. |
quality.modifierExtension | Extensions that cannot be ignored even if unrecognized | May be used to represent additional information that is not part of the basic definition of the element and that modifies the understanding of the element in which it is contained and/or the understanding of the containing element's descendants. Usually modifier elements provide negation or qualification. To make the use of extensions safe and manageable, there is a strict set of governance applied to the definition and use of extensions. Though any implementer can define an extension, there is a set of requirements that SHALL be met as part of the definition of the extension. Applications processing a resource are required to check for modifier extensions. Modifier extensions SHALL NOT change the meaning of any elements on Resource or DomainResource (including cannot change the meaning of modifierExtension itself). | There can be no stigma associated with the use of extensions by any application, project, or standard - regardless of the institution or jurisdiction that uses or defines the extensions. The use of extensions is what allows the FHIR specification to retain a core level of simplicity for everyone. |
quality.type | indel \| snp \| unknown | INDEL / SNP / Undefined variant. | |
quality.standardSequence | Standard sequence for comparison | Gold standard sequence used for comparing against. | |
quality.start | Start position of the sequence | Start position of the sequence. If the coordinate system is either 0-based or 1-based, then start position is inclusive. | |
quality.end | End position of the sequence | End position of the sequence. If the coordinate system is 0-based then end is exclusive and does not include the last position. If the coordinate system is 1-based, then end is inclusive and includes the last position. | |
quality.score | Quality score for the comparison | The score of an experimentally derived feature such as a p-value ([SO:0001685](http://www.sequenceontology.org/browser/current_svn/term/SO:0001685)). | |
quality.method | Method to get quality | Which method is used to get sequence quality. | |
quality.truthTP | True positives from the perspective of the truth data | True positives, from the perspective of the truth data, i.e. the number of sites in the Truth Call Set for which there are paths through the Query Call Set that are consistent with all of the alleles at this site, and for which there is an accurate genotype call for the event. | |
quality.queryTP | True positives from the perspective of the query data | True positives, from the perspective of the query data, i.e. the number of sites in the Query Call Set for which there are paths through the Truth Call Set that are consistent with all of the alleles at this site, and for which there is an accurate genotype call for the event. | |
quality.truthFN | False negatives | False negatives, i.e. the number of sites in the Truth Call Set for which there is no path through the Query Call Set that is consistent with all of the alleles at this site, or sites for which there is an inaccurate genotype call for the event. Sites with correct variant but incorrect genotype are counted here. | |
quality.queryFP | False positives | False positives, i.e. the number of sites in the Query Call Set for which there is no path through the Truth Call Set that is consistent with this site. Sites with correct variant but incorrect genotype are counted here. | |
quality.gtFP | False positives where the non-REF alleles in the Truth and Query Call Sets match | The number of false positives where the non-REF alleles in the Truth and Query Call Sets match (i.e. cases where the truth is 1/1 and the query is 0/1 or similar). | |
quality.precision | Precision of comparison | QUERY.TP / (QUERY.TP + QUERY.FP). | |
quality.recall | Recall of comparison | TRUTH.TP / (TRUTH.TP + TRUTH.FN). | |
quality.fScore | F-score | Harmonic mean of Recall and Precision, computed as: 2 * precision * recall / (precision + recall). | |
quality.roc | Receiver Operator Characteristic (ROC) Curve | Receiver Operator Characteristic (ROC) Curve to give sensitivity/specificity tradeoff. | |
quality.roc.id | Unique id for inter-element referencing | Unique id for the element within a resource (for internal references). This may be any string value that does not contain spaces. | |
quality.roc.extension | Additional content defined by implementations | May be used to represent additional information that is not part of the basic definition of the element. To make the use of extensions safe and manageable, there is a strict set of governance applied to the definition and use of extensions. Though any implementer can define an extension, there is a set of requirements that SHALL be met as part of the definition of the extension. | There can be no stigma associated with the use of extensions by any application, project, or standard - regardless of the institution or jurisdiction that uses or defines the extensions. The use of extensions is what allows the FHIR specification to retain a core level of simplicity for everyone. |
quality.roc.modifierExtension | Extensions that cannot be ignored even if unrecognized | May be used to represent additional information that is not part of the basic definition of the element and that modifies the understanding of the element in which it is contained and/or the understanding of the containing element's descendants. Usually modifier elements provide negation or qualification. To make the use of extensions safe and manageable, there is a strict set of governance applied to the definition and use of extensions. Though any implementer can define an extension, there is a set of requirements that SHALL be met as part of the definition of the extension. Applications processing a resource are required to check for modifier extensions. Modifier extensions SHALL NOT change the meaning of any elements on Resource or DomainResource (including cannot change the meaning of modifierExtension itself). | There can be no stigma associated with the use of extensions by any application, project, or standard - regardless of the institution or jurisdiction that uses or defines the extensions. The use of extensions is what allows the FHIR specification to retain a core level of simplicity for everyone. |
quality.roc.score | Genotype quality score | Individual data point representing the GQ (genotype quality) score threshold. | |
quality.roc.numTP | Roc score true positive numbers | The number of true positives if the GQ score threshold was set to "score" field value. | |
quality.roc.numFP | Roc score false positive numbers | The number of false positives if the GQ score threshold was set to "score" field value. | |
quality.roc.numFN | Roc score false negative numbers | The number of false negatives if the GQ score threshold was set to "score" field value. | |
quality.roc.precision | Precision of the GQ score | Calculated precision if the GQ score threshold was set to "score" field value. | |
quality.roc.sensitivity | Sensitivity of the GQ score | Calculated sensitivity if the GQ score threshold was set to "score" field value. | |
quality.roc.fMeasure | FScore of the GQ score | Calculated fScore if the GQ score threshold was set to "score" field value. | |
readCoverage | Average number of reads representing a given nucleotide in the reconstructed sequence | Coverage (read depth or depth) is the average number of reads representing a given nucleotide in the reconstructed sequence. | |
repository | External repository which contains a detailed report related to observedSeq in this resource | Configurations of the external repository. The repository shall store the target's observedSeq or records related to the target's observedSeq. | |
repository.id | Unique id for inter-element referencing | Unique id for the element within a resource (for internal references). This may be any string value that does not contain spaces. | |
repository.extension | Additional content defined by implementations | May be used to represent additional information that is not part of the basic definition of the element. To make the use of extensions safe and manageable, there is a strict set of governance applied to the definition and use of extensions. Though any implementer can define an extension, there is a set of requirements that SHALL be met as part of the definition of the extension. | There can be no stigma associated with the use of extensions by any application, project, or standard - regardless of the institution or jurisdiction that uses or defines the extensions. The use of extensions is what allows the FHIR specification to retain a core level of simplicity for everyone. |
repository.modifierExtension | Extensions that cannot be ignored even if unrecognized | May be used to represent additional information that is not part of the basic definition of the element and that modifies the understanding of the element in which it is contained and/or the understanding of the containing element's descendants. Usually modifier elements provide negation or qualification. To make the use of extensions safe and manageable, there is a strict set of governance applied to the definition and use of extensions. Though any implementer can define an extension, there is a set of requirements that SHALL be met as part of the definition of the extension. Applications processing a resource are required to check for modifier extensions. Modifier extensions SHALL NOT change the meaning of any elements on Resource or DomainResource (including cannot change the meaning of modifierExtension itself). | There can be no stigma associated with the use of extensions by any application, project, or standard - regardless of the institution or jurisdiction that uses or defines the extensions. The use of extensions is what allows the FHIR specification to retain a core level of simplicity for everyone. |
repository.type | directlink \| openapi \| login \| oauth \| other | Click and see / RESTful API / Need login to see / RESTful API with authentication / Other ways to see resource. | |
repository.url | URI of the repository | URI of an external repository which contains further details about the genetics data. | |
repository.name | Repository's name | Name of the external repository which contains further details about the genetics data. | |
repository.datasetId | Id of the dataset that is used to call for the dataset in the repository | Id of the dataset in this external repository. The server will understand how to use this id to call for more info about datasets in the external repository. | |
repository.variantsetId | Id of the variantset that is used to call for the variantset in the repository | Id of the variantset in this external repository. The server will understand how to use this id to call for more info about variantsets in the external repository. | |
repository.readsetId | Id of the read | Id of the read in this external repository. | |
pointer | Pointer to next atomic sequence | Pointer to next atomic sequence which at most contains one variant. | |
structureVariant | Structural variant | Information about chromosome structure variation. | |
structureVariant.id | Unique id for inter-element referencing | Unique id for the element within a resource (for internal references). This may be any string value that does not contain spaces. | |
structureVariant.extension | Additional content defined by implementations | May be used to represent additional information that is not part of the basic definition of the element. To make the use of extensions safe and manageable, there is a strict set of governance applied to the definition and use of extensions. Though any implementer can define an extension, there is a set of requirements that SHALL be met as part of the definition of the extension. | There can be no stigma associated with the use of extensions by any application, project, or standard - regardless of the institution or jurisdiction that uses or defines the extensions. The use of extensions is what allows the FHIR specification to retain a core level of simplicity for everyone. |
structureVariant.modifierExtension | Extensions that cannot be ignored even if unrecognized | May be used to represent additional information that is not part of the basic definition of the element and that modifies the understanding of the element in which it is contained and/or the understanding of the containing element's descendants. Usually modifier elements provide negation or qualification. To make the use of extensions safe and manageable, there is a strict set of governance applied to the definition and use of extensions. Though any implementer can define an extension, there is a set of requirements that SHALL be met as part of the definition of the extension. Applications processing a resource are required to check for modifier extensions. Modifier extensions SHALL NOT change the meaning of any elements on Resource or DomainResource (including cannot change the meaning of modifierExtension itself). | There can be no stigma associated with the use of extensions by any application, project, or standard - regardless of the institution or jurisdiction that uses or defines the extensions. The use of extensions is what allows the FHIR specification to retain a core level of simplicity for everyone. |
structureVariant.variantType | Structural variant change type | Information about chromosome structure variation DNA change type. | |
structureVariant.exact | Does the structural variant have base pair resolution breakpoints? | Used to indicate if the outer and inner start-end values have the same meaning. | |
structureVariant.length | Structural Variant Length | Length of the variant chromosome. | |
structureVariant.outer | Structural variant outer | Structural variant outer. | |
structureVariant.outer.id | Unique id for inter-element referencing | Unique id for the element within a resource (for internal references). This may be any string value that does not contain spaces. | |
structureVariant.outer.extension | Additional content defined by implementations | May be used to represent additional information that is not part of the basic definition of the element. To make the use of extensions safe and manageable, there is a strict set of governance applied to the definition and use of extensions. Though any implementer can define an extension, there is a set of requirements that SHALL be met as part of the definition of the extension. | There can be no stigma associated with the use of extensions by any application, project, or standard - regardless of the institution or jurisdiction that uses or defines the extensions. The use of extensions is what allows the FHIR specification to retain a core level of simplicity for everyone. |
structureVariant.outer.modifierExtension | Extensions that cannot be ignored even if unrecognized | May be used to represent additional information that is not part of the basic definition of the element and that modifies the understanding of the element in which it is contained and/or the understanding of the containing element's descendants. Usually modifier elements provide negation or qualification. To make the use of extensions safe and manageable, there is a strict set of governance applied to the definition and use of extensions. Though any implementer can define an extension, there is a set of requirements that SHALL be met as part of the definition of the extension. Applications processing a resource are required to check for modifier extensions. Modifier extensions SHALL NOT change the meaning of any elements on Resource or DomainResource (including cannot change the meaning of modifierExtension itself). | There can be no stigma associated with the use of extensions by any application, project, or standard - regardless of the institution or jurisdiction that uses or defines the extensions. The use of extensions is what allows the FHIR specification to retain a core level of simplicity for everyone. |
structureVariant.outer.start | Structural Variant Outer Start | Structural Variant Outer Start. If the coordinate system is either 0-based or 1-based, then start position is inclusive. | |
structureVariant.outer.end | Structural Variant Outer End | Structural Variant Outer End. If the coordinate system is 0-based then end is exclusive and does not include the last position. If the coordinate system is 1-based, then end is inclusive and includes the last position. | |
structureVariant.inner | Structural variant inner | Structural variant inner. | |
structureVariant.inner.id | Unique id for inter-element referencing | Unique id for the element within a resource (for internal references). This may be any string value that does not contain spaces. | |
structureVariant.inner.extension | Additional content defined by implementations | May be used to represent additional information that is not part of the basic definition of the element. To make the use of extensions safe and manageable, there is a strict set of governance applied to the definition and use of extensions. Though any implementer can define an extension, there is a set of requirements that SHALL be met as part of the definition of the extension. | There can be no stigma associated with the use of extensions by any application, project, or standard - regardless of the institution or jurisdiction that uses or defines the extensions. The use of extensions is what allows the FHIR specification to retain a core level of simplicity for everyone. |
structureVariant.inner.modifierExtension | Extensions that cannot be ignored even if unrecognized | May be used to represent additional information that is not part of the basic definition of the element and that modifies the understanding of the element in which it is contained and/or the understanding of the containing element's descendants. Usually modifier elements provide negation or qualification. To make the use of extensions safe and manageable, there is a strict set of governance applied to the definition and use of extensions. Though any implementer can define an extension, there is a set of requirements that SHALL be met as part of the definition of the extension. Applications processing a resource are required to check for modifier extensions. Modifier extensions SHALL NOT change the meaning of any elements on Resource or DomainResource (including cannot change the meaning of modifierExtension itself). | There can be no stigma associated with the use of extensions by any application, project, or standard - regardless of the institution or jurisdiction that uses or defines the extensions. The use of extensions is what allows the FHIR specification to retain a core level of simplicity for everyone. |
structureVariant.inner.start | Structural Variant Inner Start | Structural Variant Inner Start. If the coordinate system is either 0-based or 1-based, then start position is inclusive. | |
structureVariant.inner.end | Structural Variant Inner End | Structural Variant Inner End. If the coordinate system is 0-based then end is exclusive and does not include the last position. If the coordinate system is 1-based, then end is inclusive and includes the last position. | |
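
The quality.precision, quality.recall, and quality.fScore rows in the table give their definitions as formulas. As a rough illustration only, those formulas could be computed as in the Python sketch below. The function names are hypothetical, and returning 0.0 when a denominator is zero is an assumption of this sketch, not something the definitions specify.

```python
def precision(query_tp: int, query_fp: int) -> float:
    """QUERY.TP / (QUERY.TP + QUERY.FP), as in quality.precision."""
    denominator = query_tp + query_fp
    # Assumption: report 0.0 when there are no query calls at all.
    return query_tp / denominator if denominator else 0.0


def recall(truth_tp: int, truth_fn: int) -> float:
    """TRUTH.TP / (TRUTH.TP + TRUTH.FN), as in quality.recall."""
    denominator = truth_tp + truth_fn
    # Assumption: report 0.0 when the truth set is empty.
    return truth_tp / denominator if denominator else 0.0


def f_score(p: float, r: float) -> float:
    """Harmonic mean: 2 * precision * recall / (precision + recall)."""
    # Assumption: report 0.0 when both precision and recall are zero.
    return 2 * p * r / (p + r) if (p + r) else 0.0


# Made-up counts, purely for illustration.
p = precision(query_tp=90, query_fp=10)   # 0.9
r = recall(truth_tp=90, truth_fn=30)      # 0.75
print(p, r, f_score(p, r))
```

The same three functions could be applied per threshold to the numTP, numFP, and numFN values of a quality.roc entry, under the additional assumption that those counts are measured the same way.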
The Clinical Genomics committee has identified overlaps and redundancies between content in the MolecularSequence resource and content in Observation profiles in the evolving Implementation Guide for Clinical Genomics Reporting. The committee is considering options for modifying the resource and anticipates potential changes being brought forward in an upcoming ballot.
The MolecularSequence resource is designed to describe an atomic sequence, which contains the alignment sequencing test result and multiple variations. Atomic sequences can be connected by a link element, forming a sequence graph, and a full sequence can be reported in this way. Complete genetic sequence information, of which specific genetic variations are a part, is reported by reference to the GA4GH repository. Thus, the FHIR MolecularSequence resource avoids large genomic payloads in a manner analogous to how the FHIR ImagingStudy resource references large images maintained in other systems. For use cases and details on how this resource interacts with other Clinical Genomics resources or profiles, please refer to the implementation guidance document.
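
To make the relationship between a reference window, a variant, and the observed sequence more concrete, here is a minimal Python sketch. It assumes a single variant, a 0-based coordinate system (inclusive start, exclusive end), and a reference window held as a plain string. The function `apply_variant` and its parameters are hypothetical names that loosely mirror referenceSeq.windowStart, variant.start, variant.end, variant.referenceAllele, variant.observedAllele, and observedSeq from the table above; it is an illustration, not part of the resource definition.

```python
def apply_variant(window_seq: str, window_start: int,
                  variant_start: int, variant_end: int,
                  reference_allele: str, observed_allele: str) -> str:
    """Return the observed sequence for a reference window with one variant.

    Assumption: all positions are 0-based, start inclusive, end exclusive.
    """
    # Translate absolute positions into offsets within the window string.
    lo = variant_start - window_start
    hi = variant_end - window_start
    if not (0 <= lo <= hi <= len(window_seq)):
        raise ValueError("variant lies outside the reference window")
    # Sanity check: the stated reference allele must match the window.
    if window_seq[lo:hi] != reference_allele:
        raise ValueError("referenceAllele does not match the reference window")
    # Splice the observed allele in place of the reference allele.
    return window_seq[:lo] + observed_allele + window_seq[hi:]


# A window covering positions 100..107 with a T>G substitution at position 103.
print(apply_variant("ACGTACGT", 100, 103, 104, "T", "G"))  # ACGGACGT
```

A deletion can be expressed in this sketch by passing an empty string as the observed allele; more complex changes described with a CIGAR string are outside its scope.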
This resource is designed to describe sequence variations with clinical significance with information such as:
It is strongly encouraged to provide all available information in this resource for any reported variants, because receiving systems (e.g. discovery research, outcomes analysis, and public health reporting) may use this information to normalize variants over time or across sources. However, these data should not be used to dynamically correct/change variant representations for clinical use outside of the laboratory, due to insufficient information.
Implementers should be aware that semantic equivalency of results of genetic variants cannot be guaranteed unless there is an agreed upon standard between sending and receiving systems.
The focus of the resource is to provide the sequencing alignment data from which the interpretation for clinical decision-making originates. Hence data such as the precise reads of DNA sequences and sequence alignment are not included; such data are nonetheless accessible through references to the GA4GH (Global Alliance for Genomics and Health) API. The MolecularSequence resource will be referenced by Observation to provide variant information. As clinical assessments/diagnoses of a patient are typically captured in the Condition resource or the ClinicalImpression resource, the MolecularSequence resource can be referenced by the Condition resource to provide specific genetic data to support assertions. This is analogous to how Condition references other resources, such as AllergyIntolerance, Procedure, and Questionnaire resources.
When variant information is saved, the positions of the nucleic acid are numbered in order. Some file formats use 0-based coordinates (e.g. the BED file format), while others use 1-based coordinates (e.g. the VCF file format). The coordinateSystem element of the MolecularSequence resource records which one is in use.
MolecularSequence.coordinateSystem is constrained to two possible values: 0 for a 0-based system, which numbers the sequence starting from 0, and 1 for a 1-based system, which numbers the first position as 1. The significant difference between the two systems is the end position. In a 0-based system the end position is exclusive, meaning the last position is not contained in the sequence window, while in a 1-based system the end position is inclusive, meaning the last position is included in the sequence window. Note that both systems have an inclusive start position.
For example, ACGTGCAT is numbered from 1 to 8 in a 1-based system, and from 0 to 8 in a 0-based system, where the numbers mark flanks (i.e. the places between two nucleotides). So the interval [3,5] in the 1-based system is GTG, and the interval [2,5) in the 0-based system is the same segment GTG.
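The relationship between the two systems can be sketched in a few lines of Python. This is only an illustration of the rule described above; the function names `to_zero_based`, `to_one_based`, and `window` are hypothetical and are not part of the FHIR specification.

```python
def to_zero_based(start: int, end: int) -> tuple[int, int]:
    """Convert a 1-based inclusive [start, end] interval to 0-based [start, end)."""
    if start < 1 or end < start:
        raise ValueError("1-based interval needs 1 <= start <= end")
    # The start shifts down by one; the inclusive end equals the exclusive end.
    return start - 1, end


def to_one_based(start: int, end: int) -> tuple[int, int]:
    """Convert a 0-based [start, end) interval to 1-based inclusive [start, end]."""
    if start < 0 or end <= start:
        raise ValueError("0-based interval needs 0 <= start < end")
    return start + 1, end


def window(sequence: str, start: int, end: int, coordinate_system: int) -> str:
    """Return the segment of `sequence` selected by the interval.

    `coordinate_system` mirrors MolecularSequence.coordinateSystem (0 or 1).
    """
    if coordinate_system == 1:
        start, end = to_zero_based(start, end)
    elif coordinate_system != 0:
        raise ValueError("coordinateSystem must be 0 or 1")
    # Python slices are already 0-based with an exclusive end.
    return sequence[start:end]


print(window("ACGTGCAT", 3, 5, 1))  # 1-based [3,5]
print(window("ACGTGCAT", 2, 5, 0))  # 0-based [2,5)
```

Both calls select the same segment, GTG, matching the example above.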
There are many definitions concerning the directionality of DNA or RNA. Here we use referenceSeq.orientation and referenceSeq.strand. orientation represents the sense of the sequence, which has different meanings depending on MolecularSequence.type. strand represents the sequence writing order: the Watson strand refers to the 5' to 3' top strand (5' -> 3'), whereas the Crick strand refers to the 5' to 3' bottom strand (3' <- 5').
We hope that the observedSeq string can be constrained beyond an arbitrary string by using notation tables. The nucleic acid string should be restricted to the following codes:
A --> adenosine | M --> A C (amino) | U --> uridine | H --> A C T | V --> G C A |
C --> cytidine | S --> G C (strong) | D --> G A T | K --> G T (keto) | |
G --> guanine | W --> A T (weak) | R --> G A (purine) | N --> A G C T (any) | |
T --> thymidine | B --> G T C | Y --> T C (pyrimidine) | - --> gap of indeterminate length |
A alanine | P proline | B aspartate or asparagine | Q glutamine |
C cysteine | R arginine | D aspartate | S serine |
E glutamate | T threonine | F phenylalanine | U selenocysteine |
G glycine | V valine | H histidine | W tryptophan |
I isoleucine | Y tyrosine | K lysine | Z glutamate or glutamine |
L leucine | X any | M methionine | * translation stop |
N asparagine | - gap of indeterminate length |
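As a rough sketch of how the two notation tables above could be applied, the following Python checks an observedSeq string against the listed codes. The names `NUCLEIC_ACID_CODES`, `AMINO_ACID_CODES`, and `invalid_symbols` are hypothetical. Two assumptions are made that the text does not state: lowercase letters are accepted, and DNA and RNA share the single nucleic acid table.

```python
# Codes copied from the nucleic acid and amino acid tables above.
NUCLEIC_ACID_CODES = frozenset("ACGTUMRWSYKVHDBN-")
AMINO_ACID_CODES = frozenset("ABCDEFGHIKLMNPQRSTUVWXYZ*-")


def invalid_symbols(observed_seq: str, sequence_type: str) -> list[tuple[int, str]]:
    """Return (index, symbol) pairs for symbols outside the notation table.

    `sequence_type` mirrors MolecularSequence.type: "aa", "dna" or "rna".
    Indices are 0-based Python string offsets, not sequence coordinates.
    """
    if sequence_type == "aa":
        allowed = AMINO_ACID_CODES
    elif sequence_type in ("dna", "rna"):
        # Assumption: one shared table for DNA and RNA, as presented above.
        allowed = NUCLEIC_ACID_CODES
    else:
        raise ValueError("type must be 'aa', 'dna' or 'rna'")
    # Assumption: case-insensitive comparison.
    return [
        (index, symbol)
        for index, symbol in enumerate(observed_seq)
        if symbol.upper() not in allowed
    ]


print(invalid_symbols("ACGTN-", "dna"))
print(invalid_symbols("ACGJ", "dna"))
```

An empty list means every symbol appears in the corresponding table; a stricter check could additionally reject U for DNA or T for RNA.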
chromosome | Chromosome number of the reference sequence | MolecularSequence.referenceSeq.chromosome |
end | End position (0-based exclusive, which means the acid at this position will not be included; 1-based inclusive, which means the acid at this position will be included) of the reference sequence. | MolecularSequence.referenceSeq.windowEnd |
identifier | The unique identity for a particular sequence | MolecularSequence.identifier |
patient | The subject that the observation is about | MolecularSequence.patient |
referenceseqid | Reference Sequence of the sequence | MolecularSequence.referenceSeq.referenceSeqId |
start | Start position (0-based inclusive, 1-based inclusive, that means the nucleic acid or amino acid at this position will be included) of the reference sequence. | MolecularSequence.referenceSeq.windowStart |
type | Amino Acid Sequence/ DNA Sequence / RNA Sequence | MolecularSequence.type |
chromosome-coordinate | Search parameter for a region of the chromosome sequence string. It refers to part of a locus or part of a gene, with the search region expressed in the 1-based system. Since coordinateSystem can be either 0-based or 1-based, the query returns resources in both coordinate systems that contain the equivalent segment of the gene or whole genome sequence. For example, the search `chromosome-coordinate=1$lt345$gt123` looks for MolecularSequence resources on chromosome 1 with position >123 and <345: for resources in the 1-based system, all strings within region 1:124-344 are returned, while for resources in the 0-based system, all strings within region 1:123-344 are returned. See the discussion of 0-based vs. 1-based coordinates above. | MolecularSequence |
referenceseqid-coordinate | Search parameter for a region of the reference sequence. It refers to part of a locus or part of a gene, with the search region expressed in the 1-based system. Since coordinateSystem can be either 0-based or 1-based, the query returns resources in both coordinate systems that contain the equivalent segment of the gene or whole genome sequence. For example, the search `referenceSeqId-coordinate=NC_000001.11$lt345$gt123` looks for MolecularSequence resources on NC_000001.11 with position >123 and <345: for resources in the 1-based system, all strings within region NC_000001.11:124-344 are returned, while for resources in the 0-based system, all strings within region NC_000001.11:123-344 are returned. See the discussion of 0-based vs. 1-based coordinates above. | MolecularSequence |
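The way a coordinate search value maps onto the two coordinate systems can be sketched as follows. This is an illustrative reading of the examples in the table above, not a server implementation: the functions `parse_coordinate_query` and `equivalent_window` are hypothetical, and the sketch assumes the value always carries exactly one `lt` and one `gt` condition, as in the examples.

```python
def parse_coordinate_query(value: str) -> tuple[str, int, int]:
    """Split e.g. '1$lt345$gt123' into (name, gt_bound, lt_bound).

    Both bounds are exclusive and expressed in the 1-based system.
    """
    name, *conditions = value.split("$")
    bounds = {}
    for condition in conditions:
        prefix, number = condition[:2], condition[2:]
        if prefix not in ("lt", "gt"):
            raise ValueError(f"unsupported prefix in {condition!r}")
        bounds[prefix] = int(number)
    if set(bounds) != {"lt", "gt"}:
        raise ValueError("expected exactly one lt and one gt condition")
    return name, bounds["gt"], bounds["lt"]


def equivalent_window(gt_bound: int, lt_bound: int, coordinate_system: int) -> tuple[int, int]:
    """Return (start, end) of the searched region in the resource's own system."""
    first, last = gt_bound + 1, lt_bound - 1  # 1-based inclusive positions
    if first > last:
        raise ValueError("empty search region")
    if coordinate_system == 1:
        return first, last       # inclusive start, inclusive end
    if coordinate_system == 0:
        return first - 1, last   # inclusive start, exclusive end
    raise ValueError("coordinateSystem must be 0 or 1")


name, gt_bound, lt_bound = parse_coordinate_query("1$lt345$gt123")
print(name, equivalent_window(gt_bound, lt_bound, 1))  # region 1:124-344
print(name, equivalent_window(gt_bound, lt_bound, 0))  # region 1:123-344
```

The same parsing applies to a reference sequence identifier such as `NC_000001.11$lt345$gt123`, since only the part before the first `$` differs.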