An Optimal Algorithm for Finding Length-Constrained, Maximum-Density Segments of a Sequence

We study an abstract optimization problem arising from biomolecular sequence analysis. For a sequence A of pairs (ai,wi) for i = 1, ..., n and wi>0, a segment A(i,j) is a consecutive subsequence of A starting with index i and ending with index j. The width of A(i,j) is w(i,j) = sum{i <= k <= j} wk, and the density is (sum{i <= k <= j} ak)/ w(i,j). The maximum-density segment problem takes A and two values L and U as input and asks for a segment of A with the largest possible density among those of width at least L and at most U. When U is unbounded, we provide a relatively simple, O(n)-time algorithm, improving upon the O(n log L)-time algorithm by Lin, Jiang and Chao. When both L and U are specified, there are no previous nontrivial results. We solve the problem in O(n) time if wi=1 for all i, and more generally in O(n+n(U-L+1)) time when wi >= 1 for all i.