setSequence()
setSequence(\Amelaye\BioPHP\Domain\Sequence\Entity\Sequence $oSequence) : mixed
Injects the Sequence entity to work with.
Parameters
\Amelaye\BioPHP\Domain\Sequence\Entity\Sequence | $oSequence |
We use this class to manipulate Sequence() elements, most of the time taken from a database instance.
complement(string $sMoltypeUnfrmtd, string $sSequence = null) : string
Returns a string representing the genetic complement of a sequence.
string | $sMoltypeUnfrmtd | The type of molecule we are dealing with. If omitted, we work with "DNA" by default. |
string | $sSequence | The sequence |
A string which is the genetic complement of the input string.
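As an illustration of what "genetic complement" means here, the following is a minimal standalone sketch. `sketchComplement()` is a hypothetical helper written for this example, not the library's implementation, and it assumes only the four unambiguous bases of each alphabet.

```php
<?php
// Hypothetical helper for illustration only; not part of the documented class.
function sketchComplement(string $sequence, string $molType = 'DNA'): string
{
    // Base-pairing table: A pairs with T (DNA) or U (RNA), G pairs with C.
    $map = strtoupper($molType) === 'RNA'
        ? ['A' => 'U', 'U' => 'A', 'G' => 'C', 'C' => 'G']
        : ['A' => 'T', 'T' => 'A', 'G' => 'C', 'C' => 'G'];

    // strtr() with an array replaces every symbol in a single pass,
    // so an A turned into a T is never turned back into an A.
    return strtr(strtoupper($sequence), $map);
}

echo sketchComplement('GATTACA'), PHP_EOL; // CTAATGT
```

Symbols that are not in the table are left unchanged in this sketch.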
halfSequence(int $iIndex, string $sSequence = null) : string
Returns one of the two palindromic "halves" of a palindromic string.
int | $iIndex | Pass 0 to get the first palindromic half, pass any other number (e.g. 1) to get the second palindromic half. |
string | $sSequence | The sequence |
A string representing either the first or the second palindromic half of the string.
getBridge(string $string) : string
Returns the sequence located between two palindromic halves of a palindromic string.
Take note that the "bridge" as I call it, is not necessarily a genetic mirror or a palindrome.
string | $string | A palindromic or mirror sequence containing the bridge. |
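The relationship between the two halves and the bridge can be sketched as follows. This is a minimal sketch that assumes the bridge is at most one symbol (the middle symbol of an odd-length string); `sketchHalf()` and `sketchBridge()` are hypothetical helpers, not the library's implementation.

```php
<?php
// Hypothetical helpers for illustration only.
function sketchHalf(string $sequence, int $index): string
{
    $half = intdiv(strlen($sequence), 2);

    // Index 0 selects the leading half, anything else the trailing half.
    return $index === 0
        ? substr($sequence, 0, $half)
        : substr($sequence, strlen($sequence) - $half);
}

function sketchBridge(string $sequence): string
{
    $half = intdiv(strlen($sequence), 2);

    // Whatever remains between the two halves: '' for even lengths,
    // the single middle symbol for odd lengths.
    return substr($sequence, $half, strlen($sequence) - 2 * $half);
}

echo sketchHalf('GAATAAG', 0), ' ', sketchBridge('GAATAAG'), ' ', sketchHalf('GAATAAG', 1), PHP_EOL; // GAA T AAG
```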
expandNa(string $sSequence = null) : string
Returns the expansion of a nucleic acid sequence, replacing special wildcard symbols with the proper regular expression.
string | $sSequence | The sequence |
An "expanded" string where special metacharacters are replaced by the appropriate regular expression. For example, an N or X is replaced by the dot (.) meta-character, an R is replaced by [AG], etc.
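The expansion can be pictured as a lookup from ambiguity symbols to regular-expression fragments. The sketch below uses the standard IUPAC nucleotide ambiguity codes; only the N, X, and R entries are confirmed by the description above, so the rest of the table is an assumption about what the library covers. `sketchExpandNa()` is a hypothetical helper.

```php
<?php
// Hypothetical helper for illustration only; the library's own table may differ.
function sketchExpandNa(string $pattern): string
{
    // IUPAC ambiguity codes mapped to regular-expression fragments.
    $iupac = [
        'N' => '.',     'X' => '.',
        'R' => '[AG]',  'Y' => '[CT]',  'S' => '[GC]',  'W' => '[AT]',
        'K' => '[GT]',  'M' => '[AC]',
        'B' => '[CGT]', 'D' => '[AGT]', 'H' => '[ACT]', 'V' => '[ACG]',
    ];

    // strtr() does not rescan replaced text, so the letters inside
    // an inserted character class are not expanded a second time.
    return strtr(strtoupper($pattern), $iupac);
}

$regex = '/^' . sketchExpandNa('GANTC') . '$/';
var_dump(preg_match($regex, 'GAGTC') === 1); // bool(true)
```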
molwt(string $sLimit = "upperlimit", string $sSequence = null, string $sMolType = null, int $iNALen = null) : float
Computes the molecular weight of a particular sequence.
string | $sLimit | Upper or Lowerlimit |
string | $sSequence | The sequence |
string | $sMolType | DNA or RNA |
int | $iNALen | Length of the sequence |
The molecular weight, upper or lower limit
subSeq(int $iStart, int $iCount, string $sSequence = null) : string
Creates a new sequence object with a sequence that is a substring of another.
int | $iStart | The position in the original sequence from which we will begin extracting the subsequence; the position is expressed as a zero-based index. |
int | $iCount | The number of "letters" to include in the subsequence, starting from the position specified by the $iStart parameter. |
string | $sSequence | The sequence |
String sequence.
patPos(string $sPattern, string $sOptions = "I", string $sSequence = null) : array
Returns a two-dimensional associative array where each key is a substring matching a given pattern, and each value is an array of positional indexes which indicate the location of each occurrence of the substring (needle) in the larger string (haystack). This DOES NOT allow for pattern overlaps.
string | $sPattern | The pattern to locate |
string | $sOptions | If set to "I", pattern-matching will be case-insensitive. |
string | $sSequence | The sequence |
Value example: ( "PAT1" => (0, 17), "PAT2" => (8, 29) )
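The shape of this return value can be reproduced with `preg_match_all()` and `PREG_OFFSET_CAPTURE`. The sketch below is not the library's implementation: `sketchPatPos()` is a hypothetical helper, and it assumes the pattern is a regular-expression body that contains no `/` delimiter.

```php
<?php
// Hypothetical helper for illustration only.
function sketchPatPos(string $pattern, string $sequence, string $options = 'I'): array
{
    $flags = strtoupper($options) === 'I' ? 'i' : '';
    $positions = [];

    // preg_match_all() resumes after the end of each match,
    // so overlapping occurrences are never reported.
    preg_match_all('/' . $pattern . '/' . $flags, $sequence, $matches, PREG_OFFSET_CAPTURE);

    foreach ($matches[0] as [$substring, $offset]) {
        $positions[$substring][] = $offset; // group offsets by matched substring
    }

    return $positions;
}

print_r(sketchPatPos('A[CG]T', 'ACTTAGTACT')); // ['ACT' => [0, 7], 'AGT' => [4]]
```

Searching `AAAA` for `AA` with this sketch yields positions 0 and 2 only, which is the non-overlapping behaviour described above.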
patPoso(string $sPattern, string $sOptions = "I", int $iCutPos = 1, string $sSequence = null) : array
Similar to patPos() except that this allows for overlapping patterns.
Return value format: (index1, index2, ... ) Return value sample: ( 0, 8, 17, 29)
string | $sPattern | The pattern to locate |
string | $sOptions | If set to "I", pattern-matching will be case-insensitive. Passing anything else would cause it to be case-sensitive. |
int | $iCutPos | A non-negative integer specifying where search for the next pattern will resume, relative to the current matching substring. |
string | $sSequence | The sequence to analyze |
One-dimensional array of the form: ( position1, position2, position3, ... ) where position is a zero-based index indicating the location of the substring within the larger sequence. Thus, if substring is found at the very beginning of sequence, its position is equal to zero (0).
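One way to allow overlaps is to restart the search a fixed number of symbols after the start of each match instead of after its end. The following sketch assumes that is what the cut position controls; `sketchPatPosOverlap()` is a hypothetical helper, and it clamps the step to at least 1 so the loop always advances.

```php
<?php
// Hypothetical helper for illustration only.
function sketchPatPosOverlap(string $pattern, string $sequence, string $options = 'I', int $cutPos = 1): array
{
    $regex = '/' . $pattern . '/' . (strtoupper($options) === 'I' ? 'i' : '');
    $step = max(1, $cutPos); // a step of 0 would never advance
    $positions = [];
    $offset = 0;

    while ($offset <= strlen($sequence)
        && preg_match($regex, $sequence, $match, PREG_OFFSET_CAPTURE, $offset) === 1) {
        $positions[] = $match[0][1];      // zero-based start of this match
        $offset = $match[0][1] + $step;   // resume relative to the match start
    }

    return $positions;
}

print_r(sketchPatPosOverlap('AA', 'AAAA')); // [0, 1, 2]
```

With a step of 2 the same call returns positions 0 and 2, matching the non-overlapping result.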
patFreq(string $sPattern, string $sSequence, string $sOptions = "I") : array
Returns a one-dimensional associative array where each key is a substring matching the given pattern, and each value is the frequency count of the substring within the larger string.
Return value example: ( "GAATTC" => 3, "ATAT" => 4, ... )
string | $sPattern | The pattern to search for and tally. |
string | $sSequence | Sequence |
string | $sOptions | If set to "I", pattern-matching and tallying will be case-insensitive. Passing anything else would cause it to be case-sensitive. |
The function returns an array of the form: ( substring1 => frequency1, substring2 => frequency2, ... )
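A tally of this form can be sketched by counting the substrings returned by `preg_match_all()`. `sketchPatFreq()` is a hypothetical helper, not the library's implementation, and it assumes the pattern contains no `/` delimiter.

```php
<?php
// Hypothetical helper for illustration only.
function sketchPatFreq(string $pattern, string $sequence, string $options = 'I'): array
{
    $flags = strtoupper($options) === 'I' ? 'i' : '';
    preg_match_all('/' . $pattern . '/' . $flags, $sequence, $matches);

    // $matches[0] lists every matched substring; count identical ones.
    return array_count_values($matches[0]);
}

print_r(sketchPatFreq('A[CG]T', 'ACTTAGTACT')); // ['ACT' => 2, 'AGT' => 1]
```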
findPattern(string $sPattern, string $sSequence = null, string $sOptions = "I") : array
Returns a one-dimensional array enumerating each occurrence or instance of a given pattern in a larger string or sequence. This returns the actual substring (that matches the pattern) itself.
string | $sPattern | The pattern to search for |
string | $sSequence | The sequence |
string | $sOptions | If set to "I", pattern-matching will be case-insensitive. Passing anything else would cause the pattern-matching to be case-sensitive. |
A one-dimensional array
symFreq(string $sSymbol, string $sSequence = null) : int
Returns the frequency of a given symbol in the sequence property string. Note that you can pass this a symbol argument which may not be part of the sequence's alphabet.
In this case, the method simply returns zero (0).
string | $sSymbol | The symbol whose frequency in a sequence we wish to determine. |
string | $sSequence | The sequence |
The frequency (number of occurrences) of a particular symbol in a sequence string.
getCodon(int $iIndex, string $sSequence = null, int $iReadFrame) : string
Returns the n-th codon in a sequence, with numbering starting at 0.
int | $iIndex | The index number of the codon. |
string | $sSequence | The sequence |
int | $iReadFrame | The reading frame, which may be 0, 1, or 2 only. If omitted, this is set to 0 by default. |
The n-th codon in the sequence.
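The indexing works out to a fixed offset: codon n in reading frame f starts at position f + 3n. A minimal sketch, using the hypothetical helper `sketchGetCodon()`; returning an empty string for an out-of-range index is an assumption of this example, not documented behaviour.

```php
<?php
// Hypothetical helper for illustration only.
function sketchGetCodon(string $sequence, int $index, int $readFrame = 0): string
{
    $start = $readFrame + 3 * $index;

    // Assumption: an incomplete or out-of-range codon yields ''.
    if ($index < 0 || $start + 3 > strlen($sequence)) {
        return '';
    }

    return substr($sequence, $start, 3);
}

echo sketchGetCodon('ATGGCCTAA', 1), PHP_EOL;    // GCC
echo sketchGetCodon('ATGGCCTAA', 1, 1), PHP_EOL; // CCT
```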
translate(string $sSequence = null, int $iReadFrame, int $iFormat = 1) : string
Translates a particular DNA sequence into its protein product sequence, using the given substitution matrix.
string | $sSequence | The sequence |
int | $iReadFrame | The reading frame (0, 1, or 2) to be used in translating a nucleic sequence into a protein. A value of 0 means that the first codon starts at the first "letter" in the sequence, a value of 1 means that the first codon starts at the second "letter" in the sequence, and so on. When omitted, this argument is set to reading frame 0 by default. |
int | $iFormat | This may be passed the value 1 or 3 and determines the format of the output string. Passing 1 would cause translate() to output a string made up of single-letter amino acid symbols strung together without any space in between. Passing 3 would output a string made up of three-letter amino acid symbols separated by a space. |
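To make the reading-frame behaviour concrete, here is a minimal sketch of single-letter translation (the format 1 output). It assumes the standard genetic code, stored in the compact form indexed by base order T, C, A, G; the library may use a different table. `sketchTranslateCodon()` and `sketchTranslate()` are hypothetical helpers, and returning `X` for an unrecognised codon is an assumption of this example.

```php
<?php
// Hypothetical helpers for illustration only; standard genetic code assumed.
function sketchTranslateCodon(string $codon): string
{
    $bases = 'TCAG';
    // 64 amino acids ordered by first, second, third base in T, C, A, G order.
    $aminoAcids = 'FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG';

    $codon = strtr(strtoupper($codon), 'U', 'T'); // accept RNA codons as well
    if (strlen($codon) !== 3) {
        return 'X';
    }

    $index = 0;
    for ($i = 0; $i < 3; $i++) {
        $pos = strpos($bases, $codon[$i]);
        if ($pos === false) {
            return 'X'; // symbol outside A, C, G, T/U
        }
        $index = $index * 4 + $pos;
    }

    return $aminoAcids[$index];
}

function sketchTranslate(string $sequence, int $readFrame = 0): string
{
    $protein = '';
    // Walk complete codons only, starting at the reading-frame offset.
    for ($i = $readFrame; $i + 3 <= strlen($sequence); $i += 3) {
        $protein .= sketchTranslateCodon(substr($sequence, $i, 3));
    }

    return $protein;
}

echo sketchTranslate('ATGGCCTAA'), PHP_EOL;    // MA*
echo sketchTranslate('ATGGCCTAA', 1), PHP_EOL; // WP
```

In frame 1 the trailing `AA` does not form a complete codon and is ignored by this sketch.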
charge(string $sAminoSeq) : string
Translates an amino acid sequence into its equivalent "charge sequence".
Function charge() accepts a string of amino acids in single-letter format and outputs a string of charges in single-letter format also. A for acidic, C for basic, and N for neutral.
string | $sAminoSeq | A string representing an amino acid sequence (e.g. GAVLIFYWKRH). If omitted, this is set to the sequence property of the "calling" Seq object. If the latter is not set either, the function returns the boolean value of FALSE. |
A string where each amino acid "letter" is replaced by A (if amino acid is acidic), C (if amino acid is basic), or N (if amino acid is neutral), e.g. ACNNCCNANCCNA.
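The mapping can be sketched as a single character translation. The acidic (D, E) and basic (K, R, H) sets follow the chemical groups listed for chemicalGroup(); treating every other standard amino acid as neutral is an assumption of this example. `sketchCharge()` is a hypothetical helper.

```php
<?php
// Hypothetical helper for illustration only.
function sketchCharge(string $aminoSeq): string
{
    $map = array_fill_keys(str_split('DE'), 'A')                // acidic
         + array_fill_keys(str_split('KRH'), 'C')               // basic
         + array_fill_keys(str_split('GAVLISTNQFYWCMP'), 'N');  // neutral

    // Array-form strtr() does not rescan its output, so a 'C' produced
    // for a basic residue is not re-read as cysteine.
    return strtr(strtoupper($aminoSeq), $map);
}

echo sketchCharge('GAVLIFYWKRH'), PHP_EOL; // NNNNNNNNCCC
```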
chemicalGroup(string $sAminoSeq) : string
Returns a string of symbols from an 8-letter alphabet: A, L, M, R, C, H, I, S.
Chemical groups: L - GAVLI, H - ST, M - NQ, R - FYW, S - CM, I - P, A - DE, C - KRH, - , X - X
string | $sAminoSeq | A string representing an amino acid chain (e.g. GAVLI). If omitted, this is set to the sequence property of the "calling" Seq object. If the latter is not set either, the function returns the boolean value of FALSE. |
A string where each amino acid "letter" is replaced by one of the following: A (acidic group), L (aliphatic group), M (amide group), R (aromatic group), C (basic group), H (hydroxyl), I (imino group), S (sulfur group).
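The group table above translates directly into a lookup. A minimal sketch using the hypothetical helper `sketchChemicalGroup()`; symbols outside the listed groups (for example X) are left unchanged here, which is an assumption of this example.

```php
<?php
// Hypothetical helper for illustration only.
function sketchChemicalGroup(string $aminoSeq): string
{
    // Group symbol => member amino acids, as listed in the description.
    $groups = [
        'L' => 'GAVLI', 'H' => 'ST', 'M' => 'NQ', 'R' => 'FYW',
        'S' => 'CM',    'I' => 'P',  'A' => 'DE', 'C' => 'KRH',
    ];

    $map = [];
    foreach ($groups as $symbol => $members) {
        foreach (str_split($members) as $aminoAcid) {
            $map[$aminoAcid] = $symbol;
        }
    }

    // Single-pass replacement, so group symbols are never re-translated.
    return strtr(strtoupper($aminoSeq), $map);
}

echo sketchChemicalGroup('STNQFYWCMPDEKRH'), PHP_EOL; // HHMMRRRSSIAACCC
```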
translateCodon(string $sCodon, int $iFormat = 3) : string
Translates a single codon into an amino acid.
string | $sCodon | A three-letter nucleic acid sequence (each letter can be A, U, G, or C) which translates into a single amino acid residue. |
int | $iFormat | This may be passed the value 1 or 3 and determines the format of the output string. When omitted, $iFormat is set to 3 by default. |
When $iFormat is passed a value of 1, the function returns a single letter. When $iFormat is passed a value of 3, the function returns a string of three letters. The return value represents a single amino acid residue.
isMirror(string $sSequence = null) : bool
Returns TRUE if the given sequence or string is a "genetic mirror" which is the same as a "string palindrome", i.e., a sequence that "looks" the same when read backwards.
Definition of terms:
MIRROR: The equivalent of a string palindrome in programming terms. Mirrors come in two varieties, ODD-LENGTH and EVEN-LENGTH. Under the strict biological definition, mirrors are EVEN-LENGTH only.
MIRROR SEQUENCE: seq1-[X]-seq2, where X is an optional nucleotide base (A, G, C, or T). Seq1 and Seq2 are called the complementary sequences or halves. For our purposes, we call [X] the "bridge".
string | $sSequence | A sequence which we want to test if it is a mirror or not. |
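In string terms the test reduces to comparing the sequence with its reverse. A minimal sketch using the hypothetical helper `sketchIsMirror()`; it accepts both odd and even lengths, ignores case, and treats the empty string as not a mirror, all of which are assumptions of this example.

```php
<?php
// Hypothetical helper for illustration only.
function sketchIsMirror(string $sequence): bool
{
    $sequence = strtoupper($sequence);

    // A mirror reads the same backwards; an odd-length mirror
    // has a single-symbol bridge in the middle.
    return $sequence !== '' && $sequence === strrev($sequence);
}

var_dump(sketchIsMirror('GAAG'));    // bool(true)
var_dump(sketchIsMirror('GAATAAG')); // bool(true)
var_dump(sketchIsMirror('GATC'));    // bool(false)
```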
findMirror(string $sSequence = null, int $iPallen1 = null, int $iPallen2 = null, string $sOptions = "E") : array
Returns a three-dimensional associative array listing all mirror substrings contained within a given sequence, and their location (expressed as a zero-based index number).
string | $sSequence | The sequence which will be searched by the method for any occurrences of mirrors. If omitted, this is set to the sequence property of the current Seq object. |
int | $iPallen1 | The length of the shortest mirror to look for. |
int | $iPallen2 | The length of the longest mirror to look for. |
string | $sOptions | May be "E" or "O" or "A". If "E" is passed, then the method only looks for mirrors with even lengths. If "O" is passed, the method only looks for mirrors with odd lengths. If "A" is passed, then method looks for all mirrors (odd and even lengths). If omitted, this is set to "E" by default. |
A three-dimensional associative array keyed by mirror length: ( [2] => ( ("AA", 3), ("GG", 7) ), [4] => ( ("GAAG", 16) ) )
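The structure of this result (mirror length, then a list of substring and position pairs) can be reproduced with a brute-force window scan. This is a sketch only, not the library's algorithm; `sketchFindMirrors()` is a hypothetical helper.

```php
<?php
// Hypothetical helper for illustration only.
function sketchFindMirrors(string $sequence, int $minLen, int $maxLen, string $options = 'E'): array
{
    $found = [];

    for ($len = max(1, $minLen); $len <= $maxLen; $len++) {
        // "E" keeps even lengths only, "O" odd lengths only, "A" keeps all.
        if ($options === 'E' && $len % 2 !== 0) {
            continue;
        }
        if ($options === 'O' && $len % 2 === 0) {
            continue;
        }

        // Slide a window of the current length over the sequence.
        for ($pos = 0; $pos + $len <= strlen($sequence); $pos++) {
            $candidate = substr($sequence, $pos, $len);
            if ($candidate === strrev($candidate)) {
                $found[$len][] = [$candidate, $pos];
            }
        }
    }

    return $found;
}

print_r(sketchFindMirrors('CGAAGT', 2, 4)); // [2 => [['AA', 2]], 4 => [['GAAG', 1]]]
```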
isPalindrome(string $sSequence = null) : bool
Tests if a given sequence is a "genetic palindrome" (as opposed to a "string palindrome"). A "genetic palindrome" is one where the ends of a sequence are reverse complements of each other.
For mirror repeats, we allow strings with both ODD and EVEN lengths.
string | $sSequence | A sequence which we want to test if it is a genetic palindrome or not. |
TRUE if the given string is a genetic palindrome, FALSE otherwise.
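The "ends are reverse complements of each other" condition can be sketched by comparing the leading edge with the reverse complement of the trailing edge. The helpers below are hypothetical and assume DNA with the four unambiguous bases; the explicit edge-length argument is an assumption of this example, not a parameter of isPalindrome().

```php
<?php
// Hypothetical helpers for illustration only.
function sketchReverseComplement(string $dna): string
{
    // Complement each base (A<->T, G<->C), then reverse the string.
    return strrev(strtr(strtoupper($dna), 'ATGC', 'TACG'));
}

function sketchHasPalindromicEdges(string $dna, int $edgeLength): bool
{
    $dna = strtoupper($dna);
    if ($edgeLength < 1 || 2 * $edgeLength > strlen($dna)) {
        return false;
    }

    // Leading edge must equal the reverse complement of the trailing edge.
    return substr($dna, 0, $edgeLength) === sketchReverseComplement(substr($dna, -$edgeLength));
}

var_dump(sketchHasPalindromicEdges('GAATTC', 3));   // bool(true)
var_dump(sketchHasPalindromicEdges('ATGttCAT', 3)); // bool(true)
var_dump(sketchHasPalindromicEdges('GATTACA', 3));  // bool(false)
```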
findPalindrome(string $sSequence, int $iSeqLen, int $iPalLen) : bool|array
Returns a two-dimensional array containing palindromic substrings found in a sequence, and their location, in terms of zero-based indices. E.g. ( ("ATGttCAT", 2), ("ATGccccccCAT", 18), ... )
CASES:
1) seqlen is not set, pallen is not set: return FALSE (function error).
2) seqlen is set, pallen is set.
3) seqlen is set, pallen is not set.
4) seqlen is not set, pallen is set.
string | $sSequence | The sequence to be searched by the method for any genetic palindromes. If omitted, this is set to the sequence property of the current Seq object. |
int | $iSeqLen | The length of the palindromic substring within $sSequence. If omitted, the method searches for palindromes of whatever length. |
int | $iPalLen | The length of one of two palindromic edges in a palindromic substring within $sSequence. |
A two-dimensional array of the form: ((palindrome1, position1), (palindrome2, position2), ...)
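For the case where both lengths are set, the search can be sketched as a window scan that checks the two edges of each window. This is a brute-force illustration, not the library's algorithm; `sketchFindPalindromes()` is a hypothetical helper that assumes DNA input and returns the matches in upper case.

```php
<?php
// Hypothetical helper for illustration only; covers the "both lengths set" case.
function sketchFindPalindromes(string $dna, int $seqLen, int $palLen): array
{
    $dna = strtoupper($dna);
    $found = [];

    if ($palLen < 1 || 2 * $palLen > $seqLen) {
        return $found; // the two edges must fit inside the window
    }

    for ($pos = 0; $pos + $seqLen <= strlen($dna); $pos++) {
        $window = substr($dna, $pos, $seqLen);
        $head = substr($window, 0, $palLen);
        $tail = substr($window, -$palLen);

        // Edges match when the head equals the reverse complement of the tail.
        if ($head === strrev(strtr($tail, 'ATGC', 'TACG'))) {
            $found[] = [$window, $pos];
        }
    }

    return $found;
}

print_r(sketchFindPalindromes('CCATGTTCATGG', 8, 3)); // [['ATGTTCAT', 2]]
```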