How To Get Filename From Stdin
I am writing a script and running it from the console like this: cat source_text/* | ./mapper.py. I would like to get the name of the file that each line is being read from. Source t
Solution 1:
You cannot do that directly, because cat merges all the files into one stream before your script sees them, but the fileinput module can help you.
You just have to call your script this way:
./mapper.py source_text/*
And change it like this:
import fileinput
...
# Read pairs as lines of input from the files named on the command line
# (fileinput falls back to STDIN when no file names are given)
for line in fileinput.input():
    ...
Then the name of the file being processed is available as fileinput.filename(), and the line number within the current file as fileinput.filelineno(), among other goodies.
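As a minimal sketch of the approach above, the helper below pairs every line with the file it came from. The name tag_lines and its paths parameter are hypothetical, introduced only for illustration; the fileinput calls are the standard-library ones. The sketch uses the filename() and filelineno() methods of the object returned by fileinput.input(), which report the same values as the module-level functions of the same names.

```python
import fileinput


def tag_lines(paths=None):
    """Yield (filename, line_number_in_file, line) for each input line.

    tag_lines and paths are illustrative names, not part of fileinput.
    With paths=None, fileinput reads the files named in sys.argv[1:],
    and falls back to standard input when no arguments were given.
    """
    with fileinput.input(files=paths) as stream:
        for line in stream:
            # filelineno() restarts at 1 for every new file.
            yield stream.filename(), stream.filelineno(), line.rstrip("\n")
```

Called as tag_lines() inside mapper.py, this would pick up the file names passed on the command line. When the data does arrive on standard input, filename() reports '<stdin>', so the original file names are still unavailable in that case.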
Solution 2:
That is not possible. You can modify your program to read directly from the files like this:
import re  # re is for regular expressions
import sys

pattern = re.compile("[a-zA-Z][a-zA-Z0-9]*",
                     re.MULTILINE | re.DOTALL | re.IGNORECASE)

for filename in sys.argv[1:]:
    with open(filename) as f:
        for line in f:
            if pattern.search(line) is not None:
                # line already ends with a newline, so suppress print's own
                print(filename, line, end="")
Then you can call it with:
$ ./grep_files.py source_text/*
Solution 3:
If you use this instead of cat:
grep -r '' source_text/ | ./mapper.py
The input for mapper.py will then look like this:
source_text/answers.txt:42
source_text/answers.txt:42
source_text/file1.txt:Hello world
You can then retrieve the filename using:
import sys

for line in sys.stdin:
    filename, line = line.split(':', 1)
    ...
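The split can be wrapped in a small helper. This is only a sketch: the name split_grep_line is hypothetical, and it assumes that file names contain no colon, since the cut is made at the first colon on the line. It uses str.partition instead of split(':', 1), which gives the same result when a colon is present but does not raise when a line has none.

```python
def split_grep_line(line):
    """Split one 'filename:content' line as printed by grep -r.

    Illustrative helper. Assumes the file name itself contains no colon;
    only the first colon is used, so colons in the content are preserved.
    """
    filename, _, content = line.rstrip("\n").partition(":")
    return filename, content
```

Inside the loop over sys.stdin, filename, content = split_grep_line(line) then replaces the bare split.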
However, Python is more than capable of iterating over the files in a directory and reading them line by line, for example:
import os

for filename in os.listdir(path):
    # listdir returns bare names, so join them with the directory
    for line in open(os.path.join(path, filename)):
        ...
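A slightly fuller sketch of the same idea follows. The function name iter_directory_lines is hypothetical; the choices to skip subdirectories and to sort the names are assumptions made here so the output order is predictable, not something the answer prescribes.

```python
import os


def iter_directory_lines(path):
    """Yield (file_path, line) for every line of every regular file in path.

    Illustrative helper: entries are visited in sorted order, and
    anything that is not a regular file (e.g. a subdirectory) is skipped.
    """
    for name in sorted(os.listdir(path)):
        full_path = os.path.join(path, name)
        if not os.path.isfile(full_path):
            continue
        # The with block closes each file as soon as it has been read.
        with open(full_path) as handle:
            for line in handle:
                yield full_path, line.rstrip("\n")
```

With this, mapper.py could be run as ./mapper.py source_text and loop over iter_directory_lines(sys.argv[1]), with no cat or grep in front of it.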