Term
Which NCBI database archives, distributes, and supports submission of data that correlate genomic characteristics with observable traits and is a designated NIH repository for genome-wide association study (GWAS) results?
A) dbGaP
B) dbSNP
C) dbVar
D) OMIM |
|
Definition
|
|
Term
Generating protein alignments can generally be more informative than DNA alignments because:
A) There are 20 AA vs. 4 bases and many AA share related biophysical properties
B) Codons are degenerate and many DNA mutations do not alter the AA
C) Protein sequences offer a longer "look-back" time
D) All of the above |
|
Definition
|
|
Term
Orthologs are defined as:
A) Homologous sequences in different species that share an ancestral gene
B) Homologous sequences that share little AA identity but share great structural similarity
C) Homologous sequences in the same species that arose through duplication
D) Homologous sequences in the same species which have similar and often redundant functions |
|
Definition
Homologous sequences in different species that share an ancestral gene. |
|
|
Term
The extent to which two nucleotide or protein sequences are related is defined as:
A) Identity
B) Similarity
C) Conservation
D) Homology |
|
Definition
Similarity (identity + conservation) |
|
|
Term
Which of the following AA is least mutable according to the PAM scoring matrix?
A) Alanine
B) Glutamine
C) Methionine
D) Cysteine |
|
Definition
|
|
Term
Affine gaps refer to:
A) System of assessing gap penalties where presence of a gap is assigned more significance than the length of the gap.
B) System of gap penalties whereby gaps are deleted in pairwise comparisons between sequences
C) System of gap penalties whereby gaps are adjusted according to pairwise comparisons in MSA
D) System of gap penalties whereby presence of a gap is assigned the same score as the length of the gap |
|
Definition
System of assessing gap penalties where presence of a gap is assigned more significance than the length of the gap. |
|
|
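The affine scheme in the card above can be sketched in a few lines. This is only an illustration: the function name and the penalty values (11 to open, 1 per gapped position) are assumptions chosen for the example, and alignment programs differ in whether the first gapped position is charged an extension cost.

```python
def affine_gap_penalty(length: int, gap_open: float = 11.0, gap_extend: float = 1.0) -> float:
    """Penalty for one contiguous gap of `length` positions (illustrative values)."""
    if length <= 0:
        return 0.0
    # Opening the gap is charged once; every gapped position adds a smaller extension cost.
    return gap_open + gap_extend * length


one_long_gap = affine_gap_penalty(4)         # a single gap spanning 4 positions
four_short_gaps = 4 * affine_gap_penalty(1)  # four separate single-position gaps
```

Under these assumed values, one long gap is penalized far less than several short ones, which is the sense in which the presence of a gap matters more than its length.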
Term
The PAM250 matrix is defined as having an evolutionary divergence in which what percentage of amino acids between two homologous sequences have changed over time?
A) 1%
B) 20%
C) 80%
D) 250% |
|
Definition
|
|
Term
PAM Matrices:
A) Are based upon global alignments of closely related proteins
B) Were originally derived from 34 protein superfamilies
C) Define the score of two aligned residues i,j as 10 times the log of how likely it is to observe these two residues divided by the background probability of finding these AA by chance.
D) All of the above |
|
Definition
|
|
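The log-odds definition in option C of the card above can be written as a short function. This is a sketch under assumptions: the function and parameter names are hypothetical, the probabilities in the usage note are made up, and rounding to the nearest integer is used because published matrices report integer scores.

```python
import math


def log_odds_score(observed_pair_prob: float, expected_by_chance: float) -> int:
    """10 * log10(observed / expected), rounded to the nearest integer."""
    if observed_pair_prob <= 0 or expected_by_chance <= 0:
        raise ValueError("probabilities must be positive")
    return round(10 * math.log10(observed_pair_prob / expected_by_chance))
```

A pair seen ten times more often than chance predicts scores +10, a pair seen exactly as often as chance scores 0, and a pair seen ten times less often scores -10.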
Term
Which of the following sentences best describes the difference between a global alignment and a local alignment between two sequences?
A) Global alignment is usually used for DNA, while local alignment is usually used for protein sequences.
B) Global alignment has gaps, while local alignment does not have gaps
C) Global alignment finds the global max, while the local alignment finds the local max.
D) Global alignment aligns the whole sequence, while local alignment finds the best sub-sequence that aligns |
|
Definition
Global alignment aligns the whole sequence while local alignment finds the best sub-sequence that aligns |
|
|
Term
In Dayhoff's original protein superfamilies, which were used to compute the PAM matrix, which protein family had the lowest rate of point-accepted-mutations per 100 million years?
A) Ig Kappa
B) Ubiquitin
C) Lysozyme
D) Insulin |
|
Definition
|
|
Term
You have two distantly related proteins. Which BLOSUM or PAM is best to use to compare them?
A) BLOSUM 45 and PAM 250
B) BLOSUM 45 and PAM 10
C) BLOSUM 80 and PAM 250
D) BLOSUM 80 and PAM 10 |
|
Definition
|
|
Term
A fundamental difference between PAM and BLOSUM matrices is that:
A) BLOSUM matrices are based on local alignments while PAMs are derived from global alignments
B) BLOSUM matrices are based on global alignments while PAMS are derived on local alignments
C) BLOSUM cannot be used for aligning very distantly related proteins (<50%) whereas PAMS are best used when aligning distantly related proteins
D) BLOSUM cannot be used in generating MSAs whereas PAMS are best used when generating MSAs. |
|
Definition
BLOSUM matrices are based on local alignments while PAMs are derived from global alignments |
|
|
Term
Two proteins that share 30% AA identity are 30% homologous:
A) true
B) false |
|
Definition
|
|
Term
You have a reasonably short, typical, dsDNA sequence. Basically, how many proteins can it potentially encode?
A) 2
B) 1
C) 3
D) 6 |
|
Definition
|
|
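For the double-stranded DNA card above, the reading frames can be enumerated directly: three offsets on the given strand and three on its reverse complement. The sketch below assumes an unambiguous ACGT sequence; the function names are illustrative.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an ACGT sequence."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def six_frames(seq: str) -> list[str]:
    """Return the six reading frames: offsets 0-2 on each strand."""
    forward = seq.upper()
    reverse = reverse_complement(forward)
    return [strand[offset:] for strand in (forward, reverse) for offset in range(3)]
```

For example, `six_frames("ATGAAATAG")` yields three frames starting at `ATG...`, `TGA...`, `GAA...` and three frames read from the opposite strand.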
Term
You have a typical 300-nucleotide DNA sequence containing both the start and stop codons. Upon translating the sequence, how many AA should you see?
A) 100
B) 99
C) 98
D) 150 |
|
Definition
|
|
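The arithmetic behind the card above can be checked with a one-line calculation. The sketch assumes the open reading frame spans the entire sequence, that the start codon is translated (as methionine), and that the stop codon contributes no residue; the function name is hypothetical.

```python
def residues_from_orf(nucleotides: int) -> int:
    """Number of amino acids encoded by an ORF of the given length, stop codon included in the length."""
    if nucleotides < 6 or nucleotides % 3 != 0:
        raise ValueError("an ORF needs a whole number of codons, at least start + stop")
    codons = nucleotides // 3
    # The stop codon terminates translation and adds no amino acid.
    return codons - 1
```

Under these assumptions, 300 nucleotides form 100 codons, of which 99 are translated into residues.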
Term
You have a DNA sequence. You want to know which protein in the main protein database (nr, the nonredundant database) is most similar to some protein encoded by your DNA. Which program should you use?
A) BLASTN
B) BLASTP
C) BLASTX
D) TBLASTN |
|
Definition
|
|
Term
Which output from a BLAST search provides an estimate of the number of false positives?
A) E value
B) Bit score
C) Percent identity
D) Percent positives |
|
Definition
|
|
Term
You can limit a BLAST search using any Entrez term. For example, you can limit the results to those containing a specific researcher's name:
A) TRUE
B) FALSE |
|
Definition
|
|
Term
An extreme value distribution:
A) Describes the distribution of scores from a query against a database
B) Has a larger total area than a normal distribution
C) Is symmetric
D) Has a shape that is described by two constants (the mean and a decay constant) |
|
Definition
Describes the distribution of scores from a query against a database |
|
|
Term
As the E value of a BLAST search becomes smaller:
A) The value K (a search space parameter) also becomes smaller
B) The score tends to be larger
C) The probability P tends to be larger
D) The extreme value distribution becomes less skewed |
|
Definition
The score tends to be larger |
|
|
Term
The BLAST algorithm compiles a list of "words", typically of three AA (for a protein search). Words at or above a threshold value T are defined as:
A) "Hits" and are used to scan a database for exact matches that may then be extended
B) "Hits" and are used to scan a database for exact or partial matches that may then be extended
C) "Hits" and are aligned to each other
D) "Hits" and are reported as raw scores |
|
Definition
B) "Hits" and are used to scan a database for exact or partial matches that may then be extended |
|
|
Term
If you want literature info, what is the best website to visit?
A) OMIM
B) Entrez
C) PubMed
D) PROSITE |
|
Definition
|
|
Term
Which of the following methods allows incorporation of 3-D protein structure within a multiple sequence alignment?
A) CLUSTALW
B) MUSCLE
C) EXPRESSO
D) MAFFT |
|
Definition
|
|
Term
Benchmarking refers to:
A) Making a set of MSAs from closely related proteins that form a trusted alignment
B) Making a set of MSAs from proteins that have had their tertiary structure determined, thus allowing the MSA to be validated based on structural criteria
C) Making a set of MSAs, with an algorithm, that are subsequently employed to refine tertiary structure predictions
D) Making a set of MSAs from proteins that are known, based on structural criteria, to be members of distinct protein families. |
|
Definition
Making a set of MSAs from proteins that have had their tertiary structure determined, thus allowing the MSA to be validated based on structural criteria.
|
|
|
Term
Why doesn't CLUSTALW (a program that employs the Feng and Doolittle progressive sequence alignment algorithm) report expect values?
A) CLUSTALW does report expect values
B) CLUSTALW uses global alignments for which E value statistics are not available
C) CLUSTALW uses local alignments for which E value stats are not available
D) E value stats are not relevant to MSAs. |
|
Definition
CLUSTALW uses global alignments for which E value statistics are not available |
|
|
Term
The "once a gap, always a gap" rule for Feng-Doolittle method:
A) Assures that gaps will not be filled in inappropriately with inserted sequences
B) Assures that sequences that diverged early in evolution will be given priority in establishing the order in which MSA is constructed
C) Assures that gaps occurring between sequences that are most closely related in the MSA will be preserved
D) Assures that gaps occurring between sequences that are distantly related will be maintained in the MSA. |
|
Definition
C) Assures that gaps occurring between sequences that are most closely related in the MSA will be preserved |
|
|
Term
How can MSA programs improve performance?
A) By doing PSI-BLAST
B) By incorporating data on secondary structure
C) By incorporating data on 3D structure
D) All of the above |
|
Definition
|
|
Term
What is the main strength of consistency-based approaches (such as ProbCons or T-Coffee)?
A) They include info based on position-specific scoring matrices
B) They include info based on 3D protein structures, typically obtained from X-Ray crystallography
C) They perform profile-profile alignments and are extremely fast
D) They include info based on MSAs to guide the determination of pairwise alignments |
|
Definition
They include info based on MSAs to guide the determination of pairwise alignments |
|
|
Term
If you perform an MSA of a group of proteins and include a distantly related protein (a divergent member called an "orphan"):
A) The orphan is typically aligned with the group of proteins
B) The orphan is not typically aligned with the group of proteins |
|
Definition
The orphan is typically aligned with the group of proteins |
|
|
Term
The main difference between Pfam-A and Pfam-B is that:
A) Pfam-A is manually curated while Pfam-B is automatically curated
B) Pfam-A uses Hidden Markov models while Pfam-B does not
C) Pfam-A provides full length protein alignments while Pfam-B aligns protein fragments
D) Pfam-A incorporates data from SMART and PROSITE while Pfam-B does not. |
|
Definition
Pfam-A is manually curated while Pfam-B is automatically curated |
|
|
Term
What is a feature of algorithms that align large tracts of genomic DNA, in contrast to programs such as CLUSTALW that align smaller blocks of DNA or proteins?
A) They are generally unable to align DNA from organisms that are highly divergent, such as those speciated several hundred million years ago
B) They generally use progressive alignment and so are fundamentally similar
C) They often employ anchors that help to align regions of conservation that are interspersed with less conserved regions (such as those arising in noncoding regions, deleted regions, or inverted regions)
D) They are specialized to accept very long inputs |
|
Definition
They often employ anchors that help to align regions of conservation that are interspersed with less conserved regions (such as those arising in noncoding regions, deleted regions, or inverted regions) |
|
|
Term
Applying which of the following can improve the ability of CLUSTALW to perform MSA?
A) Assign individual weights to sequences based on their divergence levels: closely related sequences get less weight, distantly related sequences get more weight.
B) Vary scoring matrices depending on the presence of conserved or divergent sequences
C) Apply residue-specific gap penalties
D) All of the above |
|
Definition
|
|
Term
ProbCons:
A) Combines iterative and progressive approaches with a unique probabilistic model
B) Uses a hidden Markov model to calculate probability matrices for matching residues and uses these to construct a guide tree
C) Performs progressive alignment hierarchically along the guide tree
D) Performs post-processing and iterative refinement (a little like MUSCLE)
E) All of the above
|
|
Definition
|
|
Term
Effective "sequence search space" during a BLAST search refers to:
A) The product of effective query length and effective database length
B) The ratio of effective query length and database size
C) The size of the database searched
D) The size of the query itself |
|
Definition
The product of effective query length and effective database length |
|
|
Term
Changing which of the following BLAST parameters would tend to yield fewer search results?
A) Turning off the low-complexity filter
B) Changing the expect value from 1 to 10
C) Raising the threshold value
D) Changing the scoring matrix from PAM30 to PAM70 |
|
Definition
Raising the threshold value |
|
|
Term
A major difference between the progressive alignment algorithm as implemented in T-Coffee versus as implemented in CLUSTALW is:
A) The T-Coffee progressive alignment phase uses a dynamic programming algorithm with gap-opening penalties and gap-extension penalties set to zero for aligning two sequences or two groups of pre-aligned sequences
B) The T-Coffee progressive alignment phase uses a dynamic programming algorithm with equal gap-opening and gap-extension penalties set to 1
C) Whereas T-Coffee implements a hard version of introducing gaps, whereby once introduced between a pair of aligned sequences they cannot be shifted, CLUSTALW is much more flexible in introducing and then shifting gaps
D) Both T-Coffee and CLUSTALW implement the same version of the dynamic programming progressive alignment algorithm |
|
Definition
The T-Coffee progressive alignment phase uses a dynamic programming algorithm with gap-opening penalties and gap-extension penalties set to zero for aligning two sequences or two groups of pre-aligned sequences |
|
|
Term
If you were aligning a group of sequences that shared 80-100% sequence identity, which of the following PAM scoring matrices would you use for constructing a MSA using CLUSTALW?
A) PAM 350
B) PAM 120
C) PAM 20
D) PAM 35 |
|
Definition
|
|
Term
According to the molecular clock hypothesis:
A) All proteins evolve at the same, constant rate
B) All proteins evolve at a rate that matches the fossil record
C) For every given protein, the rate of molecular evolution gradually slows down like a clock that runs down
D) For every given protein, the rate of molecular evolution is approximately constant in all evolutionary lineages |
|
Definition
For every given protein, the rate of molecular evolution is approximately constant in all evolutionary lineages |
|
|
Term
The two main features of any phylogenetic tree are:
A) The clades and the nodes
B) The topology and the branch lengths
C) The clades and the roots
D) The alignment and the bootstrap |
|
Definition
The topology and the branch lengths |
|
|
Term
Which of the following is a character-based phylogenetic algorithm?
A) Neighbor-joining
B) Kimura
C) Maximum likelihood
D) PAUP |
|
Definition
|
|
Term
Two basic ways to make a phylogenetic tree are distance based and character based. A fundamental difference between them is:
A) Distance-based methods essentially summarize relatedness across the length of protein or DNA sequences while character based methods do not
B) Distance based methods are only used for DNA data while character-based methods are used for DNA or protein data
C) Distance based methods use parsimony while character-based methods do not
D) Distance based methods have branches that are proportional to time while character-based methods do not. |
|
Definition
Distance-based methods essentially summarize relatedness across the length of protein or DNA sequences while character based methods do not |
|
|
Term
An example of an operational taxonomic unit (OTU) is:
A) Multiple sequence alignment
B) Protein sequence
C) Clade
D) Node |
|
Definition
|
|
Term
For a given pair of OTUs, which of the following is true?
A) The corrected genetic distance is greater than or equal to the proportion of substitutions
B) The proportion of substitutions is greater than or equal to the corrected genetic distance. |
|
Definition
The corrected genetic distance is greater than or equal to the proportion of substitutions |
|
|
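The relationship in the card above can be illustrated with one common correction, the Jukes-Cantor model, which is an assumption here since the card does not name a model. It converts the observed proportion of differing sites p into an estimated number of substitutions per site, d = -(3/4) ln(1 - 4p/3).

```python
import math


def jukes_cantor_distance(p: float) -> float:
    """Jukes-Cantor corrected distance for an observed proportion p of differing nucleotide sites."""
    if not 0 <= p < 0.75:
        raise ValueError("p must be in [0, 0.75) for the correction to be defined")
    # The logarithm accounts for multiple substitutions hitting the same site.
    return -0.75 * math.log(1 - 4 * p / 3)
```

For small p the corrected distance is close to p, and the gap widens as p grows because more hidden multiple substitutions are inferred.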
Term
Transitions are almost always weighted more heavily than transversions
A) True
B) False |
|
Definition
|
|
Term
One of the most common errors in making and analyzing a phylogenetic tree is:
A) Using a bad MSA as input
B) Trying to infer the evolutionary relationships of genes (or proteins) in the tree
C) Trying to infer the age at which genes (or proteins) diverged from each other
D) Assuming that clades are monophyletic. |
|
Definition
|
|
Term
You have 200 viral DNA sequences of 500 residues each, and you want to know if there are any pairs that are identical (or nearly identical). Which of the following is the most efficient method to use?
A) BLAST
B) Maximum-likelihood phylogenetic tree analysis
C) Neighbor-joining phylogenetic tree analysis
D) Popset |
|
Definition
Neighbor-joining phylogenetic tree analysis |
|
|
Term
Evaluate the following statement: "Even within genes that undergo rapid evolution or sequence diversification, certain regions may still be subject to intense purifying selection"
A) TRUE
B) FALSE |
|
Definition
|
|
Term
The selective constraints on a gene can be studied given a multiple sequence alignment. Which of the following relationships between silent and non-silent mutations is suggestive of positive Darwinian selection?
A) dN/dS << 1
B) dS/dN >> 1
C) dN/dS >> 1
D) dN - dS = 1 |
|
Definition
|
|
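The ratio in the card above is usually read as follows: nonsynonymous changes accumulating faster than synonymous ones (ratio above 1) suggests positive selection, a ratio below 1 suggests purifying selection, and a ratio near 1 is consistent with neutrality. A minimal sketch, in which the function name and the `tolerance` band around 1 are hypothetical choices for illustration:

```python
def classify_selection(dn: float, ds: float, tolerance: float = 0.1) -> str:
    """Classify the selective regime from nonsynonymous (dn) and synonymous (ds) rates."""
    if ds <= 0:
        raise ValueError("dS must be positive to form the ratio")
    omega = dn / ds
    if omega > 1 + tolerance:
        return "positive"
    if omega < 1 - tolerance:
        return "purifying"
    return "neutral"
```

In practice the ratio is estimated with a statistical test rather than a fixed cutoff, so the tolerance here only stands in for that step.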
Term
Random fluctuation in allelic frequency that results solely from chance events is called:
A) Natural selection
B) Migration
C) Genetic Drift
D) Mutational load |
|
Definition
|
|
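Chance fluctuation of allele frequency can be simulated with a simple Wright-Fisher model, which is an assumption of this sketch rather than something the card specifies. Each generation, the 2N gene copies of a diploid population are drawn at random from the previous generation's allele frequency, with no selection, mutation, or migration.

```python
import random


def wright_fisher(initial_freq: float, population_size: int, generations: int, seed: int = 0) -> list[float]:
    """Return the allele-frequency trajectory under pure random sampling."""
    rng = random.Random(seed)      # seeded so a run can be repeated
    copies = 2 * population_size   # diploid population: 2N gene copies
    freq = initial_freq
    trajectory = [freq]
    for _ in range(generations):
        # Binomial sampling: each copy carries the allele with probability `freq`.
        count = sum(rng.random() < freq for _ in range(copies))
        freq = count / copies
        trajectory.append(freq)
    return trajectory
```

Running `wright_fisher(0.5, 10, 50)` and `wright_fisher(0.5, 1000, 50)` shows the expected pattern: the smaller the population, the larger the swings from one generation to the next.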
Term
Which of the following types of population-based events would result in dramatic, but nevertheless random fluctuation in allele frequencies?
A) Founder effect
B) Population bottleneck
C) Gene flow/migration
D) All of the above |
|
Definition
|
|
Term
Phylogenies are typically constructed assuming which of the following conditions?
A) Action of positive Darwinian selection
B) Mutations with free recombination
C) Mutations with no recombination
D) Robust phylogenetic trees can be reconstructed so long as you have a robust MSA |
|
Definition
Robust phylogenetic trees can be reconstructed so long as you have a robust MSA |
|
|
Term
Evaluate the statement: "Variation + Differential reproduction + Heredity = Natural Selection!"
A) TRUE
B) FALSE
C) It is not as simple as that; parts of the statement are true, although it omits several other criteria necessary to define natural selection. |
|
Definition
|
|
Term
Evaluate the statement: "At the molecular level, evolution is a process of mutation and selection".
A) TRUE
B) FALSE |
|
Definition
|
|
Term
You have a favorite gene, and you want to determine in what tissues it is expressed. Which of the following resources is likely the most direct route to this information?
A) UniGene
B) Entrez
C) PubMed
D) PCR |
|
Definition
|
|
Term
Which of the following databases is derived from mRNA information?
A) dbEST
B) PDB
C) OMIM
D) HTGS |
|
Definition
|
|
Term
Which of the following databases can be used to access text information about human diseases?
A) EST
B) PDB
C) OMIM
D) HTGS |
|
Definition
|
|
Term
What is the difference between RefSeq and GenBank?
A) RefSeq includes publicly available DNA sequences submitted from individual laboratories and sequencing projects
B) GenBank provides nonredundant curated data
C) GenBank sequences are derived from RefSeq
D) RefSeq sequences are derived from GenBank and provide nonredundant curated data |
|
Definition
RefSeq sequences are derived from GenBank and provide nonredundant curated data |
|
|
Term
Compare the use of Entrez and ExPASy to retrieve information about a protein sequence.
A) Entrez is likely to yield a more comprehensive search because GenBank has more data than EMBL
B) The search results are likely to be identical because the underlying raw data from GenBank and EMBL are the same.
C) The search results are likely to be comparable, but the SwissProt record from ExPASy will offer a different output format with distinct kinds of information.
D) None of the above |
|
Definition
The search results are likely to be comparable, but the SwissProt record from ExPASy will offer a different output format with distinct kinds of information. |
|
|
Term
My NCBI allows you to:
A) Save search queries and results
B) Set up automatic searches with email alerts
C) Store and organize NCBI database records
D) Track recent usage history
E) All of the above. |
|
Definition
|
|
Term
Which of the following databases allows for search and retrieval of collections of related sequences and alignments derived from population, phylogenetic, mutation, and ecosystem studies that have been submitted to GenBank?
A) BioProject
B) BioSample
C) PopSet
D) dbVar |
|
Definition
|
|
Term
Which of the following databases serves as a data repository and retrieval system for high-throughput functional genomic data generated by microarray and next-generation sequencing technologies?
A) dbSNP
B) dbVar
C) UniGene
D) GEO |
|
Definition
|
|
Term
Which of the following NCBI tools automatically detects homologs, including paralogs and orthologs, among the genes of 20 completely sequenced eukaryotic genomes?
A) UniGene
B) HomoloGene
C) Probe
D) BioSystems |
|
Definition
|
|
Term
A global algorithm (such as the Needleman-Wunsch algorithm) is guaranteed to find an optimal alignment. Such an algorithm:
A) Puts the two proteins being compared into a matrix and finds the optimal score by exhaustively searching every possible combination of alignments
B) Puts the two proteins being compared into a matrix and finds the optimal score by iterative recursions
C) Puts the two proteins being compared into a matrix and finds the optimal alignment by finding optimal subpaths that define the best alignment
D) Can be used for proteins but not for DNA sequences |
|
Definition
Puts the two proteins being compared into a matrix and finds the optimal alignment by finding optimal subpaths that define the best alignment |
|
|
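The matrix-and-subpath idea in the card above can be sketched with a score-only dynamic programming routine. This is a simplified illustration, not a full implementation: it assumes a linear gap penalty and a flat match/mismatch score instead of a substitution matrix, and it returns only the optimal score, not the alignment itself. Setting `local=True` switches from the global (Needleman-Wunsch style) recurrence to the local (Smith-Waterman style) one, where cells are floored at zero and the best cell anywhere in the matrix is reported.

```python
def alignment_score(a: str, b: str, match: int = 1, mismatch: int = -1,
                    gap: int = -1, local: bool = False) -> int:
    """Optimal global or local alignment score of sequences a and b."""
    rows, cols = len(a) + 1, len(b) + 1
    score = [[0] * cols for _ in range(rows)]
    if not local:
        # Global alignment: leading gaps are penalized along the first row and column.
        for i in range(1, rows):
            score[i][0] = i * gap
        for j in range(1, cols):
            score[0][j] = j * gap
    best = 0
    for i in range(1, rows):
        for j in range(1, cols):
            diagonal = score[i - 1][j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            cell = max(diagonal, score[i - 1][j] + gap, score[i][j - 1] + gap)
            if local:
                cell = max(cell, 0)  # a local alignment may restart anywhere
            score[i][j] = cell
            best = max(best, cell)
    # Global: score of aligning both sequences end to end. Local: best-scoring subsequence pair.
    return best if local else score[-1][-1]
```

Each cell reuses the optimal scores of its three neighbors, so the optimum is found without enumerating every possible alignment.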
Term
In a database search or in a pairwise alignment, sensitivity is defined as:
A) The ability of a search algorithm to find true positives (i.e., homologous sequences) and to avoid false positives (i.e., unrelated sequences having high similarity scores)
B) The ability of a search algorithm to find true positives (i.e., homologous sequences) and to avoid false positives (i.e., homologous sequences that are not reported)
C) The ability of a search algorithm to find true positives (i.e., homologous sequences) and to avoid false negatives (i.e., unrelated sequences having high similarity)
D) The ability of a search algorithm to find true positives (i.e., homologous sequences) and to avoid false negatives (i.e., homologous sequences that are not reported) |
|
Definition
The ability of a search algorithm to find true positives (i.e., homologous sequences) and to avoid false negatives (i.e., homologous sequences that are not reported) |
|
|
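The definition above corresponds to the standard formula sensitivity = TP / (TP + FN). The sketch below also includes specificity, TN / (TN + FP), for contrast; the function names and the counts in the usage note are illustrative.

```python
def sensitivity(true_positives: int, false_negatives: int) -> float:
    """Fraction of true homologs that the search actually reports."""
    return true_positives / (true_positives + false_negatives)


def specificity(true_negatives: int, false_positives: int) -> float:
    """Fraction of unrelated sequences that the search correctly leaves out."""
    return true_negatives / (true_negatives + false_positives)
```

For instance, a search that reports 90 of 100 true homologs has `sensitivity(90, 10)` equal to 0.9, regardless of how many unrelated sequences it also reports.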
Term
You have a DNA sequence that is 25 kb long and you want to search against an organism-specific genome database. What should be your optimum "word-size" setting if you are performing a BLASTN search?
A) 2
B) 16-256
C) 1-2
D) 2-10 |
|
Definition
|
|
Term
Changing which of the following options can help in reducing false positives during a BLASTP search:
A) Increasing the E value
B) Increasing the word size
C) Applying conditional or universal compositional score adjustment
D) All of the above |
|
Definition
Applying conditional or universal compositional score adjustment |
|
|
Term
In a BLAST search the E value is:
A) The number of alignments with scores greater than or equal to score S that are expected to occur by chance in a database search
B) Related to a probability value, P = 1 - e^(-E)
C) Extremely low when the alignment is very good
D) All of the above |
|
Definition
|
|
Term
A fundamental difference between the raw score and "bit" score output in BLAST search is:
A) Bit scores are comparable between different searches because they are normalized to account for the use of different scoring matrices and different database sizes
B) Raw scores are comparable between different searches because they are normalized to account for the use of different scoring matrices and different database sizes
C) Bit scores are calculated from the substitution matrix
D) Raw score can be adjusted whereas the bit score is always fixed |
|
Definition
Bit scores are comparable between different searches because they are normalized to account for the use of different scoring matrices and different database sizes |
|
|
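The E value, P value, and bit score from the two cards above are linked by the Karlin-Altschul relations: E = K·m·n·e^(−λS), P = 1 − e^(−E), and bit score S' = (λS − ln K) / ln 2, so that E = m·n·2^(−S'). The sketch below uses illustrative defaults for λ and K (assumed values, since the real ones depend on the scoring matrix and gap penalties) and treats m and n as the effective query and database lengths.

```python
import math


def e_value(raw_score: float, query_len: float, db_len: float,
            lam: float = 0.267, k: float = 0.041) -> float:
    """Expected number of chance alignments scoring at least raw_score."""
    return k * query_len * db_len * math.exp(-lam * raw_score)


def bit_score(raw_score: float, lam: float = 0.267, k: float = 0.041) -> float:
    """Normalized score that already folds in the matrix-specific lambda and K."""
    return (lam * raw_score - math.log(k)) / math.log(2)


def p_value(e: float) -> float:
    """Probability of at least one chance alignment, P = 1 - exp(-E)."""
    return -math.expm1(-e)
```

Because λ and K are absorbed into the bit score, the same bit score yields the same E value for a given search space, whichever scoring system produced the raw score. For small E, P and E are nearly equal.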
Term
A PSI-BLAST search is most useful when you want to do the following:
A) Find the rat ortholog of a human protein
B) Extend a database search to find additional proteins
C) Extend a database search to find additional DNA sequences
D) Use a pattern or signature to extend a protein search. |
|
Definition
Extend a database search to find additional proteins |
|
|
Term
A type of phylogenetic "clade" that includes the common ancestor but not all of the ancestor's descendants is called:
A) Monophyletic
B) Polyphyletic
C) Polytomic
D) Paraphyletic |
|
Definition
|
|
Term
Which of the following BLAST programs uses a signature of amino acids to find protein within a family?
A) PSI-BLAST
B) PHI-BLAST
C) MS_BLAST
D) WormBLAST |
|
Definition
|
|
Term
In a position-specific scoring matrix, the column headings can have 20 AA, and the rows can represent the residues of a query sequence. Within the matrix, the score for any given AA residue is assigned based on:
A) A PAM or BLOSUM matrix
B) Its frequency of occurrence in an MSA
C) Its background frequency of occurrence
D) The score of its neighboring amino acids |
|
Definition
Its frequency of occurrence in an MSA |
|
|
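How one row of such a matrix might be derived from a single alignment column can be sketched as a log-odds calculation. This is a deliberately simplified version under stated assumptions: a uniform background frequency of 1/20, a flat pseudocount, and no sequence weighting, whereas PSI-BLAST uses more elaborate versions of each. The function name is hypothetical.

```python
import math
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def pssm_column_scores(column: str, pseudocount: float = 1.0) -> dict[str, float]:
    """Log2-odds score of each amino acid at one MSA column (gaps and unknowns ignored)."""
    background = 1 / len(AMINO_ACIDS)  # assumed uniform background frequency
    counts = Counter(res for res in column.upper() if res in AMINO_ACIDS)
    total = sum(counts.values()) + pseudocount * len(AMINO_ACIDS)
    return {
        aa: math.log2(((counts[aa] + pseudocount) / total) / background)
        for aa in AMINO_ACIDS
    }
```

For a column such as `"LLLLIV"`, leucine receives a positive score and residues never seen at that position receive negative scores, so the same residue can score differently at different positions of the query.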
Term
As part of a PSI-BLAST search, a score is assigned to the alignment between a query sequence and a database match over some length (such as 50 AA residues). It is possible for this pairwise alignment to receive a higher or lower score over successive PSI-BLAST iterations, even though there is no change in which amino acid residues are aligned.
A) TRUE
B) FALSE |
|
Definition
|
|
Term
A position-specific scoring matrix is said to be "corrupted" when it incorporates a spurious sequence (i.e., a false positive result). Which of the following choices is the best way to reduce corruption?
A) Lower the E value
B) Remove the filtering
C) Use a shorter query
D) Run fewer iterations |
|
Definition
|
|
Term
What is the main advantage of employing reverse position specific BLAST?
A) Reversing a query and/or a set of database sequences provides a set of null alignments from which the statistical significance of a PSI-BLAST search can be estimated
B) This method precomputes a large collection of position-specific matrices, allowing a query to be rapidly assigned to a protein family
C) This method allows critical conserved residues in the query sequence to be identified
D) This method facilitates the comparison of multiple PSSMs |
|
Definition
This method precomputes a large collection of position-specific matrices, allowing a query to be rapidly assigned to a protein family |
|
|
Term
What capability does a profile hidden Markov Model offer that PSI-BLAST does not offer for protein queries?
A) A profile HMM can model the likelihood of insertions and deletions in aligned residues
B) A profile HMM can identify distantly related homologs that are not identified by standard BLAST searches
C) A profile HMM can estimate the probability of achieving particular scores for aligned residues across the length of a multiple sequence alignment
D) A profile HMM can model both protein relationships that are neither conserved or distant |
|
Definition
A profile HMM can model the likelihood of insertions and deletions in aligned residues |
|
|
Term
Which of the following BLAST algorithms utilizes the "CD" database for performing similarity searches?
A) PSI-BLAST
B) PHI-BLAST
C) Delta-BLAST
D) All of the above |
|
Definition
|
|
Term
In probability theory, the probability associated with a given single event is called:
A) Marginal probability
B) Joint probability
C) Conditional probability
D) Bayesian probability |
|
Definition
|
|
Term
Which of the following databases are based on profile HMMs:
A) PFAM
B) SMART
C) INTERPRO
D) All of the above |
|
Definition
|
|