
How To Get Filename From Stdin

I am writing a script that I run from the console like this: cat source_text/* | ./mapper.py, and I would like to get the name of the file that each line is being read from. Source t

Solution 1:

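You cannot do that directly, because cat concatenates the files into a single stream and the file names never reach your script. The fileinput module can help, though.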

You just have to call your script this way:

./mapper.py source_text/*

And change it like this:

import fileinput
...

# Read lines from the files named on the command line (or from STDIN if none are given)
for line in fileinput.input():
    ...

The name of the file being processed is then available as fileinput.filename(), and the line number within the current file as fileinput.filelineno(), among other goodies.
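
As a minimal sketch of how this might look in practice, the helper below pairs each line with the file it came from. The name tag_lines and the tuple layout are illustrative assumptions, not part of the original answer; only fileinput.input(), filename() and filelineno() come from the standard library.

```python
import fileinput


def tag_lines(paths=None):
    """Yield (filename, line number within that file, line) tuples.

    `paths` is a list of file names. None means "use sys.argv[1:]",
    which is fileinput's default behaviour.
    """
    with fileinput.input(files=paths) as stream:
        for line in stream:
            # filename() and filelineno() always describe the line just read
            yield stream.filename(), stream.filelineno(), line.rstrip("\n")
```

Inside mapper.py this could be used as `for name, lineno, line in tag_lines(): ...`, with the script invoked as ./mapper.py source_text/*.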

Solution 2:

That is not possible. You can modify your program to read directly from the files like this:

import sys
import re

# re is for regular expressions
# The pattern matches a letter followed by any number of letters or digits
pattern = re.compile("[a-zA-Z][a-zA-Z0-9]*",
                     re.MULTILINE | re.DOTALL | re.IGNORECASE)
for filename in sys.argv[1:]:
    with open(filename) as f:
        for line in f:
            if pattern.search(line) is not None:
                # line keeps its trailing newline, so suppress print's own
                print(filename, line, end="")

Then you can call it with:

$ ./grep_files.py source_text/*

Solution 3:

If you use this instead of cat:

grep -r '' source_text/ | ./mapper.py

The input for mapper.py will then look like this:

source_text/answers.txt:42
source_text/answers.txt:42
source_text/file1.txt:Hello world

You can then retrieve the filename using:

import sys

for line in sys.stdin:
    # Split on the first colon only, so colons in the text are preserved
    filename, line = line.split(':', 1)
    ...
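
A slightly more defensive version of that split could look like the sketch below. The function name split_grep_line is hypothetical, and the code assumes that no file name contains a colon; a line with no colon at all is reported as None instead of raising an error.

```python
def split_grep_line(raw):
    """Split one 'filename:content' line as produced by grep -r.

    Returns (filename, content), or None if the line has no colon.
    Assumes the file name itself contains no ':'.
    """
    # partition() splits on the first colon only and never raises
    filename, sep, content = raw.rstrip("\n").partition(":")
    if not sep:
        return None
    return filename, content
```

The loop over sys.stdin can then call split_grep_line(line) and skip any line for which it returns None.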

However, Python is perfectly capable of iterating over the files in a directory and reading them line by line, for example:

import os
for filename in os.listdir(path):
    # listdir() returns bare names, so join them with the directory
    for line in open(os.path.join(path, filename)):
        ...
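
Wrapped up as a reusable generator, the same idea might look like the following sketch. The name iter_lines is illustrative; sorting the directory listing and skipping anything that is not a regular file are assumptions added here for predictable behaviour, not part of the original answer.

```python
import os


def iter_lines(path):
    """Yield (filename, line) for every regular file directly inside `path`."""
    # os.listdir() returns names in arbitrary order, so sort for stable output
    for name in sorted(os.listdir(path)):
        full = os.path.join(path, name)
        if not os.path.isfile(full):
            continue  # skip subdirectories and other non-files
        with open(full) as f:
            for line in f:
                yield name, line.rstrip("\n")
```

A mapper could then run `for filename, line in iter_lines("source_text"): ...` without needing cat or grep at all.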
