How Would I Go About Parsing A Text File Of Thousands Of DNA Bases?
I have a massive text file containing DNA bases (A, T, C, G). What I would like to do is take every 60 characters (an arbitrary number) and put it on a
Solution 1:
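data = 'GAGACAGAGTCTCACTCTGTTGCACAGGCTGGAGTGCAGTGGCACAATCTCTGCTCACTGCAACCTCCTC'
chunk_size = 5  # length of each chunk
overlap = 2     # number of bases shared by consecutive chunks

# Advance by chunk_size - overlap so each chunk starts `overlap` bases
# before the previous chunk ends.
for pos in range(0, len(data), chunk_size - overlap):
    print(data[pos:pos+chunk_size])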
The results:
GAGAC
ACAGA
GAGTC
TCTCA
CACTC
TCTGT
...
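The snippet above works on a string that is already in memory. For a file with many thousands of bases, the same idea can be applied to a stream that is read block by block. The following is only a sketch under some assumptions: the names `iter_chunks` and `wrap_file` are made up for illustration, the input is assumed to be plain text in which whitespace and line breaks can be ignored, and the default of 60 is taken from the question. Unlike the loop above, it emits at most one short trailing chunk at the end of the input.

```python
import io


def iter_chunks(stream, chunk_size=60, overlap=0, block_size=64 * 1024):
    """Yield chunks of `chunk_size` bases from a text stream.

    Consecutive chunks share `overlap` bases. Whitespace in the input
    (including line breaks) is dropped. Hypothetical helper, not part of
    the original answer.
    """
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must satisfy 0 <= overlap < chunk_size")
    step = chunk_size - overlap
    buffer = ""
    while True:
        block = stream.read(block_size)
        if not block:
            break
        # Remove whitespace so existing line breaks do not end up inside chunks.
        buffer += "".join(block.split())
        pos = 0
        while pos + chunk_size <= len(buffer):
            yield buffer[pos:pos + chunk_size]
            pos += step
        # Keep the unconsumed tail for the next block.
        buffer = buffer[pos:]
    if buffer:
        # Whatever is left is shorter than chunk_size.
        yield buffer


def wrap_file(src_path, dst_path, width=60):
    """Write the bases in src_path to dst_path, `width` bases per line."""
    with open(src_path) as src, open(dst_path, "w") as dst:
        for chunk in iter_chunks(src, chunk_size=width):
            dst.write(chunk + "\n")


# Small in-memory demonstration using io.StringIO in place of a real file.
sample = io.StringIO("ACGTACGTAC")
print(list(iter_chunks(sample, chunk_size=4)))  # ['ACGT', 'ACGT', 'AC']
```

With a real file, a call such as `wrap_file("dna.txt", "dna_wrapped.txt")` (both file names are placeholders) would write one 60-base chunk per line without loading the whole file at once.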