Motif Search: Difference between revisions

From Algorithm Wiki
Jump to navigation Jump to search
(Created page with "== Problem Description== == Bounds Chart == 350px == Step Chart == 350px == Improvement Table == {| class="wikitable" style="text-align:center;" width="100%" !width="20%" | Complexity Classes !! width="40%" | Algorithm Paper Links !! width="40%" | Lower Bounds Paper Links |- | rowspan="1" | Exp/Factorial | | |- | rowspan="1" | Polynomial > 3 | | |- | rowspan="1" | Cubic | | |- | rowspan="1" |...")
 
No edit summary
 
(6 intermediate revisions by the same user not shown)
Line 1: Line 1:
== Problem Description==
{{DISPLAYTITLE:Motif Search (Motif Search)}}
== Description ==  


Motif search is the problem of identifying motifs, recurring or conserved patterns, in the strings (typically biological sequence data sets).


== Bounds Chart ==
== Parameters ==  
[[File:Motif_SearchBoundsChart.png|350px]]


== Step Chart ==
$n$: size of set of input strings
[[File:Motif_SearchStepChart.png|350px]]
 
$m$: size of input strings
 
$k$: length of substrings
 
$\sigma$: function $V(k, m)$ defined as the number of $k$-mers that are at most $m$ Hamming distance from the motif space
 
== Table of Algorithms ==  
 
{| class="wikitable sortable"  style="text-align:center;" width="100%"
 
! Name !! Year !! Time !! Space !! Approximation Factor !! Model !! Reference


== Improvement Table ==
{| class="wikitable" style="text-align:center;" width="100%"
!width="20%" | Complexity Classes !! width="40%" | Algorithm Paper Links !! width="40%" | Lower Bounds Paper Links
|-
| rowspan="1" | Exp/Factorial
|
|
|-
|-
| rowspan="1" | Polynomial > 3
 
|
| [[Lawrence, Reilly (Motif Search Motif Search)|Lawrence, Reilly]] || 1990 || $O(nm)$ || $O(nm)$ ||  || Deterministic || [https://www.ncbi.nlm.nih.gov/pubmed/2184437 Time & Space]
|
|-
|-
| rowspan="1" | Cubic
| [[Lawrence Gibbs Sampling (Motif Search Motif Search)|Lawrence Gibbs Sampling]] || 1993 || $O(nm)$ || $O(n + m)$ ||  || Deterministic || [https://science.sciencemag.org/content/262/5131/208 Time]
|
|
|-
|-
| rowspan="1" | Quadratic
| [[MotifSampler  (Motif Search Motif Search)|MotifSampler ]] || 2001 || $O(nm)$ || $O(n + m)$ ||  || Deterministic || [https://www.ncbi.nlm.nih.gov/pubmed/12015892 Time]
|
|
|-
|-
| rowspan="1" | nlogn
| [[Speller (Motif Search Motif Search)|Speller]] || 1998 || $O(mn^{2} \sigma)$ || $O(mn^{2}/w)$ || Exact || Deterministic || [https://link.springer.com/chapter/10.1007/BFb0054337 Time] & [https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2966288/pdf/1471-2105-11-S8-S1.pdf Space]
|
|
|-
|-
| rowspan="1" | Linear
| [[Mitra (Motif Search Motif Search)|Mitra]] || 2002 || $O(k nm \sigma)$ || $O(mnk)$ || Exact || Deterministic || [https://pubmed.ncbi.nlm.nih.gov/12169566/ Time] & [https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2966288/pdf/1471-2105-11-S8-S1.pdf Space]
|
|
|-
|-
| rowspan="1" | logn
| [[Census (Motif Search Motif Search)|Census]] || 2003 || $O(k nm \sigma)$ || $O(mnk)$ || Exact || Deterministic || [https://link.springer.com/chapter/10.1007/978-3-540-45078-8_5 Time] & [https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2966288/pdf/1471-2105-11-S8-S1.pdf Space]
|
|-
|
| [[Risotto (Motif Search Motif Search)|Risotto]] || 2006 || $O(mn^{2} \sigma)$ || $O(mn^{2})$ || Exact || Deterministic || [https://link.springer.com/chapter/10.1007/11682462_69 Time] & [https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2966288/pdf/1471-2105-11-S8-S1.pdf Space]
|-|}
|-
| [[PMS (Motif Search Motif Search)|PMS]] || 2007 || $O(nm^{2} \sigma)$ || $O(m^{2} n)$ || Exact || Deterministic || [https://ieeexplore.ieee.org/abstract/document/4359890 Time] & [https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2966288/pdf/1471-2105-11-S8-S1.pdf Space]
|-
| [[Roth AlignACE (Motif Search Motif Search)|Roth AlignACE]] || 1998 || $O(nm)$ || $O(n + m)$ ||  || Deterministic || [https://www.ncbi.nlm.nih.gov/pubmed/9788350 Time]
|-
| [[Helden Oligo-Analysis (Motif Search Motif Search)|Helden Oligo-Analysis]] || 1998 || $O(mn)$ || $O(m)$ || Exact || Deterministic || [https://www.ncbi.nlm.nih.gov/pubmed/9719638 Time]
|-
| [[van Helden J; Rios AF; Collado-Vides J (Motif Search Motif Search)|van Helden J; Rios AF; Collado-Vides J]] || 2000 || $O(mn)$ || $O(m)$ || Exact || Deterministic || [https://www.ncbi.nlm.nih.gov/pmc/articles/PMC102821/ Time]
|-
| [[Tompa M (Motif Search Motif Search)|Tompa M]] || 1999 || $O(mn)$ || $O(m^{2})$ || Exact || Deterministic || [https://www.aaai.org/Papers/ISMB/1999/ISMB99-030.pdf Time]
|-
| [[Sinha S; Tompa M YMF (Yeast Motif Finder) (Motif Search Motif Search)|Sinha S; Tompa M YMF (Yeast Motif Finder)]] || 2000 || $O(n^{0.{6}6} m)$ || $O(m)$ || Exact || Deterministic || [https://www.ncbi.nlm.nih.gov/pubmed/10977095 Time]
|-
| [[Bailey TL; Elkan C MEME (Motif Search Motif Search)|Bailey TL; Elkan C MEME]] || 1995 || $O(n^{2}m^{2})$ || $O(mn)$ || Exact || Deterministic || [https://link.springer.com/article/10.1007/BF00993379 Time]
|-
| [[Sagot M (Motif Search Motif Search)|Sagot M]] || 1988 || $O(n \log(n)$ m^{1.{4}5}) || $O(mn^{2}/w)$ || Exact || Deterministic || [https://link.springer.com/chapter/10.1007/BFb0054337 Time & Space]
|-
| [[Liang Cwinnower (Motif Search Motif Search)|Liang Cwinnower]] || 2003 || $O(nm^{0.5})$ || $O(m^{2})$ || Exact || Deterministic || [https://www.worldscientific.com/doi/10.1142/S0219720004000466 Time]
|-
| [[Kingsford (Motif Search Motif Search)|Kingsford]] || 2006 || $O(mn)$ || $O(m^{2}n^{2})$ || Exact || Deterministic || [https://link.springer.com/chapter/10.1007/11780441_22 Time]
|-
|}
 
== Time Complexity Graph ==
 
[[File:Motif Search - Time.png|1000px]]

Latest revision as of 10:11, 28 April 2023

Description

Motif search is the problem of identifying motifs, recurring or conserved patterns, in the strings (typically biological sequence data sets).

Parameters

$n$: size of set of input strings

$m$: size of input strings

$k$: length of substrings

$\sigma$: function $V(k, m)$ defined as the number of $k$-mers that are at most $m$ Hamming distance from the motif space

Table of Algorithms

Name Year Time Space Approximation Factor Model Reference
Lawrence, Reilly 1990 $O(nm)$ $O(nm)$ Deterministic Time & Space
Lawrence Gibbs Sampling 1993 $O(nm)$ $O(n + m)$ Deterministic Time
MotifSampler 2001 $O(nm)$ $O(n + m)$ Deterministic Time
Speller 1998 $O(mn^{2} \sigma)$ $O(mn^{2}/w)$ Exact Deterministic Time & Space
Mitra 2002 $O(k nm \sigma)$ $O(mnk)$ Exact Deterministic Time & Space
Census 2003 $O(k nm \sigma)$ $O(mnk)$ Exact Deterministic Time & Space
Risotto 2006 $O(mn^{2} \sigma)$ $O(mn^{2})$ Exact Deterministic Time & Space
PMS 2007 $O(nm^{2} \sigma)$ $O(m^{2} n)$ Exact Deterministic Time & Space
Roth AlignACE 1998 $O(nm)$ $O(n + m)$ Deterministic Time
Helden Oligo-Analysis 1998 $O(mn)$ $O(m)$ Exact Deterministic Time
van Helden J; Rios AF; Collado-Vides J 2000 $O(mn)$ $O(m)$ Exact Deterministic Time
Tompa M 1999 $O(mn)$ $O(m^{2})$ Exact Deterministic Time
Sinha S; Tompa M YMF (Yeast Motif Finder) 2000 $O(n^{0.{6}6} m)$ $O(m)$ Exact Deterministic Time
Bailey TL; Elkan C MEME 1995 $O(n^{2}m^{2})$ $O(mn)$ Exact Deterministic Time
Sagot M 1988 $O(n \log(n)$ m^{1.{4}5}) $O(mn^{2}/w)$ Exact Deterministic Time & Space
Liang Cwinnower 2003 $O(nm^{0.5})$ $O(m^{2})$ Exact Deterministic Time
Kingsford 2006 $O(mn)$ $O(m^{2}n^{2})$ Exact Deterministic Time

Time Complexity Graph

Motif Search - Time.png