In Python, the typical way to expand a wildcard is with the glob module: either glob.glob, which returns a list, or glob.iglob, which returns an iterator (which may be preferable if a large list is expected).
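As a quick illustration of the difference between the two calls, here is a minimal sketch. The helper names list_matches and count_matches are made up for this example and are not part of the glob module.

```python
import glob


def list_matches(pattern):
    """Return every match at once as a list (glob.glob)."""
    return glob.glob(pattern)


def count_matches(pattern):
    """Count matches lazily (glob.iglob), without building the full list."""
    return sum(1 for _ in glob.iglob(pattern))
```

Both helpers accept the same pattern string, for example list_matches("*.txt") or count_matches("*.txt").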
Here's a solution that uses the argparse and glob modules:
import argparse
from glob import glob

def main(file_names):
    print(file_names)

if __name__ == "__main__":
    parser = argparse.ArgumentParser()
    parser.add_argument("file_names", nargs='*')
    # nargs='*' tells it to combine all positional arguments into a single list
    args = parser.parse_args()
    file_names = list()
    # go through all of the arguments and replace ones with wildcards with the expansion
    # if a string does not contain a wildcard, glob will return it as is, provided the file exists
    for arg in args.file_names:
        file_names += glob(arg)
    main(file_names)
One caveat: I have noticed that Python and bash don't order the expanded lists in the same way. glob returns matches in arbitrary order (it depends on the file system), while bash sorts them. So if for some reason you need a deterministic ordering of the input, you should sort the resulting list yourself.
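A minimal sketch of sorting the expansion yourself follows. The function name expand_sorted is hypothetical. Note that sorted() orders strings by code point, which may still differ from bash's locale-aware ordering, so this only guarantees a repeatable order, not the same order as bash.

```python
from glob import glob


def expand_sorted(patterns):
    """Expand each pattern and sort its matches, keeping the argument order."""
    file_names = []
    for pattern in patterns:
        # Sort within each pattern so the result is repeatable across runs.
        file_names.extend(sorted(glob(pattern)))
    return file_names
```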
see also: http://stackoverflow.com/questions/12501761/passing-multple-files-with-asterisk-to-python-shell-in-windows
This is not quite correct behavior, in my opinion. I feel if the user specifies a filename and the file does not exist, the program should raise an error. If you pass a string that contains no wildcards and the file does not exist, glob will return an empty list, meaning the nonexistent file will be silently ignored. So, in my opinion, the right way is to check if the filename contains any glob tokens ("*", "?", or "[") and only run it through glob if it does.
I really wish the argparse module had an option to handle globbing automatically...
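The check described in this comment could be sketched roughly as below. The names GLOB_TOKENS and expand_arg are invented for this example, and raising IOError for a missing file is just one possible choice of error.

```python
import os
from glob import glob

# Characters that mark an argument as a glob pattern (assumption: the three
# tokens named in the comment above).
GLOB_TOKENS = ("*", "?", "[")


def expand_arg(arg):
    """Glob arg only if it contains a glob token; otherwise require it to exist."""
    if any(token in arg for token in GLOB_TOKENS):
        return glob(arg)
    if not os.path.exists(arg):
        raise IOError("no such file: %s" % arg)
    return [arg]
```

With this version a pattern that matches nothing still yields an empty list, but a plain file name that does not exist raises an error instead of being silently dropped.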
Hi furrykef,
Thanks for the comment. You're right, I didn't think of that. I need to update this code to take that into account.
In addition to what you said, there is another bug I noticed. Someone might want to pass in a literal file name containing "*" or other characters that glob treats as special ("*" is a legal character in Linux file names). The code, as written, runs everything through glob, and there's no way of telling it not to. When the shell is processing wildcards, you can quote an argument to keep it from being expanded. With the code I have here, the program will still process the quoted argument as a glob, even though the shell didn't.
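One possible way to give the user an opt-out is a command-line switch that disables globbing. This is only a sketch: the flag name --no-glob and the function parse_file_names are hypothetical and not something the original code has.

```python
import argparse
from glob import glob


def parse_file_names(argv=None):
    """Parse arguments and expand wildcards unless --no-glob is given."""
    parser = argparse.ArgumentParser()
    parser.add_argument("file_names", nargs="*")
    # --no-glob is a made-up flag name for this sketch.
    parser.add_argument("--no-glob", action="store_true",
                        help="treat every argument as a literal file name")
    args = parser.parse_args(argv)
    if args.no_glob:
        # Pass the names through untouched, wildcard characters included.
        return list(args.file_names)
    file_names = []
    for arg in args.file_names:
        file_names += glob(arg)
    return file_names
```

The switch applies to all arguments at once, so it does not let you mix literal names and patterns in a single call.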