Bioinformatics

Multiple Sequence Alignment

Finding missense variants that may help explain the differences in dentition between Neanderthals and modern humans, using a multiple sequence alignment (R packages seqinr and msa).

Greedy Algorithm and Sequence Assembly

A greedy algorithm can assemble longer sequences from the short fragments produced by sequencing by repeatedly merging the pair of fragments with the largest overlap. It does not always give the optimal result.
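As a rough illustration of the idea, here is a minimal Python sketch of greedy assembly. It assumes error-free fragments that all come from the same strand. The function names `overlap` and `greedy_assemble` and the example fragments are made up for this sketch and are not taken from the project itself.

```python
def overlap(a, b):
    """Length of the longest suffix of a that is also a prefix of b."""
    for k in range(min(len(a), len(b)), 0, -1):
        if a.endswith(b[:k]):
            return k
    return 0


def greedy_assemble(fragments):
    """Repeatedly merge the two fragments with the largest overlap."""
    # Drop exact duplicates (keeping order) and fragments contained in others.
    frags = list(dict.fromkeys(fragments))
    frags = [f for f in frags if not any(f != g and f in g for g in frags)]

    while len(frags) > 1:
        best_k, best_i, best_j = 0, None, None
        for i, a in enumerate(frags):
            for j, b in enumerate(frags):
                if i != j:
                    k = overlap(a, b)
                    if k > best_k:
                        best_k, best_i, best_j = k, i, j
        if best_k == 0:
            break  # no overlaps left; the remaining fragments stay separate
        merged = frags[best_i] + frags[best_j][best_k:]
        frags = [f for idx, f in enumerate(frags) if idx not in (best_i, best_j)]
        frags.append(merged)
    return frags


# Hypothetical reads for illustration only.
reads = ["ATTAGACCTG", "CCTGCCGGAA", "AGACCTGCCG", "GCCGGAATAC"]
contigs = greedy_assemble(reads)
print(contigs)
```

Because each step commits to the locally best merge and never revisits it, an early merge can rule out a shorter overall assembly, which is why the result is not guaranteed to be optimal.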

PAX9 Variants Random Forest Model

PAX9 is a gene in the paired-box (PAX) family, whose members encode transcription factors involved in organogenesis. Defects in PAX9 have been linked to various types of cancer and to oligodontia, a condition in which individuals are missing teeth. A random forest model was used to predict the phenotypic effect of PAX9 variants.

Hidden Markov Model

In a hidden Markov model, the Markov chain itself is hidden: we can only observe the outcome values it produces. Additional concepts such as observed states and emission probabilities are therefore explained.
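To make the role of emission probabilities concrete, here is a small Python sketch of the forward algorithm, which computes the probability of an observed sequence by summing over all hidden state paths. The two hidden states (`AT_rich`, `GC_rich`) and every probability below are invented for illustration and do not come from the project.

```python
# Toy model: all states and numbers are assumptions made up for this sketch.
states = ("AT_rich", "GC_rich")

# Probability of starting in each hidden state.
start = {"AT_rich": 0.5, "GC_rich": 0.5}

# Transition probabilities between hidden states (the hidden Markov chain).
trans = {
    "AT_rich": {"AT_rich": 0.9, "GC_rich": 0.1},
    "GC_rich": {"AT_rich": 0.2, "GC_rich": 0.8},
}

# Emission probabilities: chance of observing each nucleotide in each state.
emit = {
    "AT_rich": {"A": 0.4, "C": 0.1, "G": 0.1, "T": 0.4},
    "GC_rich": {"A": 0.1, "C": 0.4, "G": 0.4, "T": 0.1},
}


def forward(observations):
    """Probability of a non-empty observed sequence under the toy model."""
    # alpha[s] = probability of the observations so far, ending in state s.
    alpha = {s: start[s] * emit[s][observations[0]] for s in states}
    for symbol in observations[1:]:
        alpha = {
            s: emit[s][symbol] * sum(alpha[r] * trans[r][s] for r in states)
            for s in states
        }
    return sum(alpha.values())


print(forward("GCGC"))
print(forward("ATAT"))
```

Only the nucleotides are visible to the observer; the state sequence that produced them is never seen directly, so the algorithm sums over every possible state at each position.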

Bioinformatics Glossary

What do identifier prefixes like 'NG_', 'NM_', and 'ENSG' mean? What are the different file formats (FASTA, VCF, etc.) used for? And more!