Given a protein sequence, the blocks returned by either Block Searcher or IMPALA come from the Blocks database, which contains conserved regions from large protein families.
To get blocks more specific to the query protein sequence, it is better to make blocks from a subfamily of sequences rather than from the entire family. The sequence-to-subfamily block webpage automatically determines a clade of closely related sequences and returns the blocks made from those sequences.
1. Searching for Related Sequences
A PSI-BLAST search (4 iterations, E-value cutoff for inclusion 0.002) is executed on the query sequence to find closely related sequences.
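As an illustration of the search described above, the following minimal Python sketch assembles an equivalent command line for the NCBI BLAST+ `psiblast` program. The command actually run by the webpage is not given here, so this is an assumption; the file name `query.fasta` and the database name `nr` are hypothetical placeholders.

```python
import shlex


def build_psiblast_command(query_fasta, database,
                           iterations=4, inclusion_evalue=0.002):
    """Return a BLAST+ psiblast argument list using the stated parameters.

    Assumption: the BLAST+ `psiblast` binary is used; the webpage may use a
    different (older) program with the same settings.
    """
    return [
        "psiblast",
        "-query", query_fasta,                       # query protein sequence (FASTA)
        "-db", database,                             # protein database to search
        "-num_iterations", str(iterations),          # number of PSI-BLAST rounds
        "-inclusion_ethresh", str(inclusion_evalue)  # E-value cutoff for inclusion
    ]


# Hypothetical file and database names, for illustration only.
cmd = build_psiblast_command("query.fasta", "nr")
print(shlex.join(cmd))
```

The list form can be passed directly to `subprocess.run` on a machine where BLAST+ and a formatted database are installed.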
2. Selecting Closely Related Sequences
Sequences found in step 1 are clumped (clustered) at 90% identity, and MOTIF is run on these clusters to find the conserved regions. Using these conserved regions, sequences are added until the average information per residue decreases. The selected sequences tend to form a clade within the family to which the query sequence belongs.
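The stopping rule in this step can be sketched as a greedy loop. The sketch below is not the webpage's implementation. It assumes that each sequence has already been reduced to its ungapped conserved region (equal-length strings over the 20 standard amino acids), that candidates arrive ranked from most to least similar to the query, and that information is measured as relative entropy against a uniform background with a pseudocount of 1. The function names are hypothetical.

```python
import math
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def average_information(segments, pseudocount=1.0):
    """Mean information (bits) per column of equal-length aligned segments.

    Assumptions: uniform background frequencies and a simple pseudocount,
    so that a handful of sequences does not look perfectly conserved.
    """
    background = 1.0 / len(AMINO_ACIDS)
    width = len(segments[0])
    denom = len(segments) + pseudocount
    total = 0.0
    for col in range(width):
        counts = Counter(seg[col] for seg in segments)
        for aa in AMINO_ACIDS:
            p = (counts[aa] + pseudocount * background) / denom
            total += p * math.log2(p / background)
    return total / width


def select_subfamily(ranked_segments, pseudocount=1.0):
    """Add segments in rank order; stop when the average information drops."""
    selected = [ranked_segments[0]]  # the query's own conserved region
    best = average_information(selected, pseudocount)
    for seg in ranked_segments[1:]:
        score = average_information(selected + [seg], pseudocount)
        if score < best:
            break  # first decrease ends the selection
        selected.append(seg)
        best = score
    return selected


# Toy data: the fourth segment is divergent and ends the selection.
candidates = ["ACDEF", "ACDEF", "ACDEY", "WWWWW", "ACDEF"]
subfamily = select_subfamily(candidates)
print(subfamily)
```

With the toy data, the first three segments are kept and the divergent fourth one is rejected, so the later fifth segment is never considered. This mirrors the idea that the selected sequences form a tight group around the query.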
3. Making the Blocks
The regions identified by MOTIF for the selected sequences are returned.