
How Can I Print Second And Last Three Lines From Multiple Text Files, In Awk Or Python?

Using awk, I am having difficulty trying to print the second and last three lines from multiple text files. In addition, I would like to direct the output to a text file. Any help

Solution 1:

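This has the advantage that the whole file is not held in memory. Note that it treats all the input files as one continuous stream, so it prints the second line of the first file and the last three lines of the last file.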

awk 'NR == 2 {print}; {line1 = line2; line2 = line3; line3 = $0} END {print line1; print line2; print line3}' files*

Edit:

The following uses some code from the gawk manual that is portable to other versions of AWK. It provides per-file processing. Note that gawk version 4 provides BEGINFILE and ENDFILE rules.

#!/usr/bin/awk -f
function beginfile (file) {
    line1 = line2 = line3 = ""
}

function endfile (file) {
    print line1; print line2; print line3
}

FILENAME != _oldfilename \
     {
         if (_oldfilename != "")
             endfile(_oldfilename)
         _oldfilename = FILENAME
         beginfile(FILENAME)
     }

     END   { endfile(FILENAME) }

FNR == 2 {
    print
}

{
    line1 = line2; line2 = line3; line3 = $0
}

Save that as a file, perhaps calling it "fileparts". Then do:

chmod u+x fileparts

Then you can do:

./fileparts file1 file2 anotherfile somemorefiles*.txt

and it will output the second line and the last three lines of each file in one set of output.

Alternatively, you can modify the script to write to separate files, or use a shell loop to do so:

for file in file1 file2 anotherfile somemorefiles*.txt
do
    ./fileparts "$file" > "$file.out"
done

You can name the output files however you like. They will be text files.

Solution 2:

To avoid reading the entire file into memory at once, use a deque with a maxlen of 3 to create a rolling buffer for capturing the last 3 lines:

from collections import deque

def get2ndAndLast3LinesFrom(filename):
    with open(filename) as infile:
        # advance past first line
        next(infile)
        # capture second line
        second = next(infile)
        # iterate over the rest of the file a line at a time, saving the final 3
        last3 = deque(maxlen=3)
        last3.extend(infile)
        return second, list(last3)

You could generalize this approach to a function that would take any iterable:

def lastN(n, seq):
    buf = deque(maxlen=n)
    buf.extend(seq)
    return list(buf)

Then you can create different length "last-n" functions using partial:

from functools import partial
last3 = partial(lastN, 3)

print(last3(range(100000000)))  # use xrange instead of range in Python 2
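
The question also asks about several files and about sending the output to a text file. Here is a minimal sketch of one way to do that with the same rolling-buffer idea. The names `second_and_last3` and `write_summary` are made up for this example and are not part of the answer above. Unlike the function above, this version seeds the deque with the first two lines, so a file with fewer than five lines still yields its true last three lines, and a file with fewer than two lines does not raise `StopIteration`.

```python
from collections import deque
from itertools import islice


def second_and_last3(path):
    """Return (second_line_or_None, last_three_lines) for one file, read once."""
    with open(path) as infile:
        # islice yields at most two lines, so short files do not raise
        head = list(islice(infile, 2))
        second = head[1] if len(head) == 2 else None
        # seed the rolling buffer with the lines already consumed, then let
        # the deque discard everything except the final three lines
        last3 = deque(head, maxlen=3)
        last3.extend(infile)
    return second, list(last3)


def write_summary(paths, out_path):
    """Write the second and last three lines of each file to out_path."""
    with open(out_path, "w") as out:
        for path in paths:
            second, last3 = second_and_last3(path)
            lines = ([second] if second is not None else []) + last3
            for line in lines:
                # make sure a final line without a newline does not run
                # into the output of the next file
                out.write(line if line.endswith("\n") else line + "\n")
```

It could be called as `write_summary(["file1", "file2"], "summary.txt")`, where both the input names and the output name are placeholders.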

Solution 3:

If you aren't wedded to Python or AWK for the implementation, you can do something very straightforward using your shell and the standard head/tail utilities.

for file in "$@"; do
    head -n2 "$file" | tail -n1
    tail -n3 "$file"
done

You can also wrap this in a function or place it in a script, and then call it from within Python with subprocess.check_output() (or from AWK with system()) if you really want, but in such cases it may just be easier to use native methods rather than spawning an external process.
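
For illustration, here is a rough sketch of what calling the utilities from Python might look like. It assumes `head` and `tail` are available on the PATH, and the function name `second_and_last3_via_shell` is hypothetical. It runs the two commands directly instead of going through a shell, so file names with spaces need no extra quoting.

```python
import subprocess


def second_and_last3_via_shell(path):
    """Return the second and last three lines of path, using head and tail."""
    # equivalent to: head -n2 FILE | tail -n1
    first_two = subprocess.check_output(["head", "-n", "2", path])
    second = subprocess.check_output(["tail", "-n", "1"], input=first_two)
    # equivalent to: tail -n3 FILE
    last3 = subprocess.check_output(["tail", "-n", "3", path])
    # check_output returns bytes; decode to text before returning
    return (second + last3).decode()
```

This starts three processes per file, which is the overhead the paragraph above warns about.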

Solution 4:

This would work, but it does load the entire file in memory, which might not be ideal if your files are very large.

with open(filename) as infile:
    text = infile.readlines()

print(text[1], end="")      # print second line (index 1)
for line in text[-3:]:      # print last three lines, in file order
    print(line, end="")

There are also some good alternatives discussed here.

Solution 5:

I don't know about awk, but if you are using Python, I guess you will need something like this:

# read all lines (this loads the whole file into memory)
with open('test1.txt') as inf:
    lines = inf.readlines()

# write the second line and the last three lines
# (the output is plain text, despite the .ods extension)
with open('Spreadsheet.ods', 'w') as outf:
    outf.write(lines[1])
    outf.writelines(lines[-3:])
