Difference between revisions of "Tutorial"
Revision as of 20:06, 18 August 2014
Biopool library
The Biopool class implementation follows the composite design pattern; for a complete description of the class hierarchy, see the [Doxygen documentation]. Without going into implementation details, a Protein object is just a container for vectors representing chains. Each vector has two elements: the Spacer and the LigandSet. The Spacer is the container for AminoAcid objects, whereas the LigandSet is a container for all other molecules and ions, including DNA/RNA chains. Ultimately all molecules, both in the Spacer and in the LigandSet, are collections of Atom objects. The main feature of Biopool is that each AminoAcid object in the Spacer is connected to its neighbours by means of one rotational vector plus one translational vector. This representation makes it easy to modify the protein structure, and many functions are provided to modify, perturb, or transform the relative positions of residues efficiently. Rotation and translation vectors:
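To make the idea of linking each residue to its neighbour by a rotation plus a translation more concrete, here is a minimal, self-contained sketch. It does not use Biopool at all: `Vec3`, `Mat3`, `Link`, `rotZ` and `place` are hypothetical names chosen for illustration, and the rotation is stored as a 3x3 matrix purely for brevity. The point it tries to show is that absolute positions can be derived by walking the chain, so changing a single link moves every downstream residue without editing their stored data.

```cpp
#include <array>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical helper types for illustration; these are not Biopool classes.
using Vec3 = std::array<double, 3>;
using Mat3 = std::array<Vec3, 3>;

// Placement of one residue relative to its predecessor.
struct Link {
    Mat3 rot;    // rotation relative to the previous residue's frame
    Vec3 trans;  // translation expressed in the previous residue's frame
};

Mat3 identity() { return {{{1, 0, 0}, {0, 1, 0}, {0, 0, 1}}}; }

// Rotation by `deg` degrees about the z axis.
Mat3 rotZ(double deg) {
    const double r = deg * std::acos(-1.0) / 180.0;
    return {{{std::cos(r), -std::sin(r), 0}, {std::sin(r), std::cos(r), 0}, {0, 0, 1}}};
}

Vec3 apply(const Mat3& m, const Vec3& v) {
    Vec3 out{};
    for (std::size_t i = 0; i < 3; ++i)
        out[i] = m[i][0] * v[0] + m[i][1] * v[1] + m[i][2] * v[2];
    return out;
}

Mat3 multiply(const Mat3& a, const Mat3& b) {
    Mat3 out{};
    for (std::size_t i = 0; i < 3; ++i)
        for (std::size_t j = 0; j < 3; ++j)
            out[i][j] = a[i][0] * b[0][j] + a[i][1] * b[1][j] + a[i][2] * b[2][j];
    return out;
}

// Walk the chain once, accumulating frames, to obtain absolute origins.
std::vector<Vec3> place(const std::vector<Link>& chain) {
    std::vector<Vec3> origins;
    Mat3 frame = identity();
    Vec3 pos{0, 0, 0};
    for (const Link& l : chain) {
        const Vec3 step = apply(frame, l.trans);  // translation in the current frame
        for (std::size_t i = 0; i < 3; ++i) pos[i] += step[i];
        frame = multiply(frame, l.rot);           // orientation seen by the next residue
        origins.push_back(pos);
    }
    return origins;
}
```

For example, with three links that each translate by one unit along x, setting `chain[0].rot = rotZ(90)` leaves the first origin where it was and swings the two that follow onto the y direction.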
The object representation looks like this:
Victor includes different packages: Biopool, Lobo and Energy. Each package has its own directory, whose name starts with a capital letter, in the main Victor path. Inside each package you will find the Source folder, which contains the class code, and the APPS directory, which contains useful utilities. In the main path you will find the data folder, which contains symbolic links to the data files used by the individual packages. In the main Victor path you should also find the bin directory, which contains the most important programs, copied from the APPS folders.
Parsing a PDB file (PdbLoader)
Biopool uses the PdbLoader class to load PDB files. By default it loads all standard residues and hetero atoms, excluding nucleotides and water molecules. When possible it also tries to place hydrogen atoms on every amino acid in the Spacer and determines the secondary structure with the DSSP algorithm. The simplest way to load a PDB file into a Protein object is:
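#include <PdbLoader.h>
#include <Protein.h>
#include <fstream>
#include <iostream>
#include <string>
using namespace std;
// If your Victor build places these classes in a namespace, add the matching using-directive here.
int main( int argc, char* argv[] ) {
    string inputFile = "MyPdbFile.pdb";
    ifstream inFile( inputFile.c_str() );
    if ( !inFile ) { // stop early if the file cannot be opened
        cerr << "Cannot open " << inputFile << endl;
        return 1;
    }
    PdbLoader pl(inFile); // creates the PdbLoader object
    Protein prot;
    prot.load( pl ); // reads the structure into the Protein object
    return 0;
}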
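The default filtering described above (keep standard residues and hetero atoms, skip water and nucleotides) can be illustrated with a small stand-alone sketch. This is not PdbLoader code; `trim` and `keepRecord` are hypothetical helpers. It only assumes the fixed-column PDB layout, where the record name occupies columns 1-6 and the residue name columns 18-20, and it treats `HOH` as water and the usual one- and two-letter codes as nucleotides.

```cpp
#include <set>
#include <string>

// Strip leading and trailing blanks from a fixed-width PDB field.
std::string trim(const std::string& s) {
    const std::size_t b = s.find_first_not_of(' ');
    if (b == std::string::npos) return "";
    const std::size_t e = s.find_last_not_of(' ');
    return s.substr(b, e - b + 1);
}

// Hypothetical filter mirroring the defaults described in the text:
// accept ATOM and HETATM records unless they belong to water or a nucleotide.
bool keepRecord(const std::string& line) {
    if (line.size() < 20) return false;                    // too short to hold a residue name
    const std::string record = line.substr(0, 6);          // columns 1-6
    if (record != "ATOM  " && record != "HETATM") return false;
    const std::string resName = trim(line.substr(17, 3));  // columns 18-20
    static const std::set<std::string> skipped = {
        "HOH",                     // water
        "A", "C", "G", "U",        // RNA nucleotides
        "DA", "DC", "DG", "DT"};   // DNA nucleotides
    return skipped.count(resName) == 0;
}
```

A loader built this way would call `keepRecord` on each line read from the file and pass only the accepted records on to the code that builds atoms and residues.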
Modify the structure
Add hydrogen atoms
Get the secondary structure
There are three ways to get the secondary structure in Victor. The first, which is inaccurate, simply parses the HELIX and SHEET records in the PDB file. The second infers the secondary structure from torsion angles. The third uses an implementation of the DSSP algorithm; its results may differ slightly (negligibly) from those of the original algorithm, but it is the most accurate way to calculate the secondary structure.
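As an illustration of the second approach, the sketch below computes a signed dihedral angle from four points and then applies a very crude region test to a (phi, psi) pair. It is independent of Victor: `dihedral` and `classify` are hypothetical names, and the angle ranges are illustrative assumptions only, not the thresholds Victor uses.

```cpp
#include <array>
#include <cmath>

using Vec3 = std::array<double, 3>;

Vec3 sub(const Vec3& a, const Vec3& b) { return {a[0] - b[0], a[1] - b[1], a[2] - b[2]}; }
double dot(const Vec3& a, const Vec3& b) { return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]; }
Vec3 cross(const Vec3& a, const Vec3& b) {
    return {a[1] * b[2] - a[2] * b[1], a[2] * b[0] - a[0] * b[2], a[0] * b[1] - a[1] * b[0]};
}

// Signed dihedral angle in degrees defined by four consecutive points.
// For phi use C(i-1), N(i), CA(i), C(i); for psi use N(i), CA(i), C(i), N(i+1).
double dihedral(const Vec3& p0, const Vec3& p1, const Vec3& p2, const Vec3& p3) {
    const Vec3 b1 = sub(p1, p0), b2 = sub(p2, p1), b3 = sub(p3, p2);
    const Vec3 n = cross(b2, b3);
    const double y = std::sqrt(dot(b2, b2)) * dot(b1, n);
    const double x = dot(cross(b1, b2), n);
    return std::atan2(y, x) * 180.0 / std::acos(-1.0);
}

// Crude region test: 'H' helix-like, 'E' strand-like, 'C' anything else.
// The ranges are illustrative assumptions, not Victor's.
char classify(double phi, double psi) {
    if (phi >= -160 && phi <= -20 && psi >= -120 && psi <= 50) return 'H';
    if (phi >= -180 && phi <= -45 && psi >= 90 && psi <= 180) return 'E';
    return 'C';
}
```

A per-residue assignment of this kind ignores hydrogen bonding entirely, which is one reason a DSSP-style calculation is expected to be more accurate.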
Align
To start the alignment process, put the target and the template sequences, in FASTA format, in a single file; this file is the input for the application.
>target
VLEEIAKDHGEALTI....
>template
AFQVTSIPTLILFQ....
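To show how such a two-record file is structured, here is a minimal FASTA reader written with the standard library only. It is not part of Victor, and `readFasta` is a hypothetical name. Whether the application requires the headers to be exactly `target` and `template` is not stated here, so the sketch simply keys each sequence by whatever follows the `>`.

```cpp
#include <istream>
#include <map>
#include <sstream>
#include <string>

// Read FASTA records into a map from header (text after '>') to sequence.
// Sequence lines belonging to the same record are concatenated.
std::map<std::string, std::string> readFasta(std::istream& in) {
    std::map<std::string, std::string> records;
    std::string line, current;
    while (std::getline(in, line)) {
        if (!line.empty() && line.back() == '\r') line.pop_back();  // tolerate CRLF files
        if (line.empty()) continue;
        if (line[0] == '>') {
            current = line.substr(1);
            records[current];              // create an empty entry for this header
        } else if (!current.empty()) {
            records[current] += line;      // append to the record being read
        }
    }
    return records;
}
```

It can be fed from a file stream or, for a quick check, from a `std::istringstream` holding the text of the input file.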
Then, depending on the type of alignment you want, you can choose between:
sequence to sequence
profile to sequence
profile to profile
You can also choose the alignment algorithm to use:
local
global
freeshift
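The difference between the three algorithms can be sketched with a single dynamic-programming routine. This is a toy example, not the code behind the application: `Mode` and `alignScore` are hypothetical names, the scoring scheme (+1 match, -1 mismatch, -2 per gap position) is arbitrary, and "freeshift" is assumed here to mean a global-style alignment in which gaps at either end of the sequences are not penalized.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>
#include <vector>

enum class Mode { Global, Local, Freeshift };

// Best alignment score under a toy scheme: +1 match, -1 mismatch, -2 per gap position.
int alignScore(const std::string& a, const std::string& b, Mode mode) {
    const int match = 1, mismatch = -1, gap = -2;
    const std::size_t n = a.size(), m = b.size();
    std::vector<std::vector<int>> s(n + 1, std::vector<int>(m + 1, 0));
    // Only the global mode charges for gaps at the start of either sequence.
    if (mode == Mode::Global) {
        for (std::size_t i = 1; i <= n; ++i) s[i][0] = static_cast<int>(i) * gap;
        for (std::size_t j = 1; j <= m; ++j) s[0][j] = static_cast<int>(j) * gap;
    }
    int best = 0;
    for (std::size_t i = 1; i <= n; ++i) {
        for (std::size_t j = 1; j <= m; ++j) {
            const int diag = s[i - 1][j - 1] + (a[i - 1] == b[j - 1] ? match : mismatch);
            int v = std::max({diag, s[i - 1][j] + gap, s[i][j - 1] + gap});
            if (mode == Mode::Local) v = std::max(v, 0);  // a local alignment may restart anywhere
            s[i][j] = v;
            best = std::max(best, v);
        }
    }
    if (mode == Mode::Global) return s[n][m];  // both sequences aligned end to end
    if (mode == Mode::Local) return best;      // best-scoring sub-alignment
    // Freeshift: trailing gaps are free, so take the best cell on the last row or column.
    int edge = s[n][m];
    for (std::size_t i = 0; i <= n; ++i) edge = std::max(edge, s[i][m]);
    for (std::size_t j = 0; j <= m; ++j) edge = std::max(edge, s[n][j]);
    return edge;
}
```

With two sequences that only overlap at their ends, such as `TTACGT` and `ACGTCC`, the global score is dragged down by the unmatched ends, while the local and freeshift scores reflect only the overlapping `ACGT`.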
If you choose the profile to sequence or profile to profile type, you can select a specific weighting scheme:
PSIC
Henikoff
SecDivergence
You can also choose a scoring function:
CrossProduct
LogAverage
DotPFreq
DotPOdds
EDistance
Pearson
JensenShannon
AtchleyDistance
AtchleyCorrelation
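To give a feel for what a profile scoring function does, the sketch below compares two profile columns (for example, residue frequencies at one position) with a plain dot product and with the Pearson correlation coefficient. These are the textbook definitions, written with hypothetical function names `dotProduct` and `pearson`; the exact formulas behind the DotPFreq and Pearson options are not given here and may differ.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Dot product of two profile columns of equal length.
double dotProduct(const std::vector<double>& p, const std::vector<double>& q) {
    assert(p.size() == q.size());
    double sum = 0.0;
    for (std::size_t i = 0; i < p.size(); ++i) sum += p[i] * q[i];
    return sum;
}

// Pearson correlation coefficient of two columns; returns 0 if either column is constant.
double pearson(const std::vector<double>& p, const std::vector<double>& q) {
    assert(p.size() == q.size() && !p.empty());
    const double n = static_cast<double>(p.size());
    double mp = 0.0, mq = 0.0;
    for (std::size_t i = 0; i < p.size(); ++i) { mp += p[i]; mq += q[i]; }
    mp /= n;
    mq /= n;
    double cov = 0.0, vp = 0.0, vq = 0.0;
    for (std::size_t i = 0; i < p.size(); ++i) {
        cov += (p[i] - mp) * (q[i] - mq);
        vp += (p[i] - mp) * (p[i] - mp);
        vq += (q[i] - mq) * (q[i] - mq);
    }
    if (vp == 0.0 || vq == 0.0) return 0.0;
    return cov / std::sqrt(vp * vq);
}
```

In a profile to profile alignment, a function of this kind would supply the score for pairing one column of the first profile with one column of the second, in place of a single substitution-matrix lookup.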
With all the alignment types you can choose which BLOSUM substitution matrix to use: 62, 45, 50 or 80.
For more detail, run the subali application with the -h option or see the AlignTest code.