Due to poor 3'-end quality, some datasets cannot be paired-end assembled.
So add an option to force the analyses even without paired-end assembly.
Possible approach for dissimilarity-based clustering:
- optimize length trimming based on maxEE/quality
- if too few sequences are paired-end assembled and the force option is set, skip paired-end assembly at the trim step
- dereplicate, then compute all pairwise dissimilarities per fragment
- resolve the unique set of assembled dereplicated sequences from both strands (and add an xx-length gap between the two fragments)
- average the dissimilarity for all pairs of assembled dereplicated sequences, weighted by fragment length
- clustering (sumaclust or MCL) on average dissimilarity
- vsearch --usearch_global with no gap extension penalty for the query sequence (--gapext 2IT/0IQ/1E), to be tested
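
The length-weighted averaging step could look roughly like the Python sketch below. It is only an illustration under assumptions that are not in the notes above: the function name, the dict-of-pairs layout, and the idea that every forward (resp. reverse) fragment has the same length after trimming are all hypothetical.

```python
from typing import Dict, Tuple

# Hypothetical layout: a pair of sequence IDs, always stored in the same order.
Pair = Tuple[str, str]


def weighted_average_dissimilarity(
    d_fwd: Dict[Pair, float],
    d_rev: Dict[Pair, float],
    len_fwd: int,
    len_rev: int,
) -> Dict[Pair, float]:
    """Combine per-fragment dissimilarities into one value per pair.

    Assumption: all forward fragments share length len_fwd and all reverse
    fragments share length len_rev (i.e. reads were trimmed to fixed lengths).
    """
    total = len_fwd + len_rev
    if len_fwd < 0 or len_rev < 0 or total == 0:
        raise ValueError("fragment lengths must be non-negative and not both zero")

    combined: Dict[Pair, float] = {}
    # Keep only pairs for which both fragments were compared.
    for pair in d_fwd.keys() & d_rev.keys():
        combined[pair] = (d_fwd[pair] * len_fwd + d_rev[pair] * len_rev) / total
    return combined
```

For example, a pair with dissimilarity 0.1 on a 200 nt forward fragment and 0.4 on a 100 nt reverse fragment gets (0.1 * 200 + 0.4 * 100) / 300 = 0.2. The resulting values would then be the input to the clustering step (sumaclust or MCL).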