Wallpaper scraper/grabber for a certain website...
ryuslash 7f8dfa1d30 Sorting, multi category, multi resolution
After a file has been downloaded a callback function can now be called.
The callback function I call checks to see if the resolution of the image appears in the collection of resolutions that has been entered in the configuration file and deletes/moves accordingly.
If a file cannot be read (which I have noticed happens sometimes), it is removed rather than copied or archived, so that it can be retried later.
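As a rough illustration of the callback idea, the sketch below decides what should happen to a freshly downloaded file. It is not the code from download.py or sorter.py: the names make_sorter, on_downloaded and read_size are hypothetical, the way the image size is read is left to the caller, and it is written for present-day Python 3 rather than the Python 2.6 the project targets.

```python
def make_sorter(resolutions, read_size):
    """Build a callback that classifies a downloaded file.

    resolutions: set of (width, height) tuples taken from the configuration.
    read_size:   hypothetical helper returning (width, height) for a path,
                 raising OSError when the file cannot be read.
    """
    def on_downloaded(path):
        try:
            size = read_size(path)
        except OSError:
            # Unreadable file: the caller removes it without archiving it,
            # so the download can be retried on a later run.
            return "retry"
        # Known resolution: keep (move) the file; otherwise reject (delete) it.
        return "keep" if tuple(size) in resolutions else "reject"

    return on_downloaded
```

A download routine would call on_downloaded(path) once the file is on disk and then move, delete or discard the file according to the returned value.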
4grab got a new command-line option, -s/--sorter, to sort out old images. Running python sorter.py has the same effect, but this seemed prettier.
Theoretically, multiple categories can now be entered into the configuration file, separated by ',', but this hasn't been tested yet.
Multiple resolutions can be entered into the configuration file, separated by ',', like so: 1680x1050,1920x1200.
Configuration now checks whether all the necessary properties are available in the configuration file; if one is missing, it tries to create it.
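The configuration changes could look roughly like the following sketch. It assumes an INI-style file read with Python 3's configparser; the section name "settings", the option names and the default values are hypothetical, and filling a missing option with a default is only one possible reading of "tries to create it". The real config.py may work differently.

```python
import configparser

# Hypothetical section, option names and defaults, for illustration only.
REQUIRED = {
    "settings": {
        "category": "example",
        "resolution": "1680x1050,1920x1200",
    },
}


def ensure_options(config, required):
    """Add any missing section or option; return the names that were created."""
    created = []
    for section, options in required.items():
        if not config.has_section(section):
            config.add_section(section)
        for name, default in options.items():
            if not config.has_option(section, name):
                config.set(section, name, default)
                created.append(f"{section}.{name}")
    return created


def parse_resolutions(value):
    """Turn '1680x1050,1920x1200' into {(1680, 1050), (1920, 1200)}."""
    resolutions = set()
    for item in value.split(","):
        item = item.strip()
        if not item:
            continue  # tolerate a trailing or doubled comma
        width, height = item.lower().split("x")
        resolutions.add((int(width), int(height)))
    return resolutions
```

The same split-on-comma approach would apply to a comma-separated list of categories.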
2010-03-19 00:18:04 +01:00
.gitignore Added --category 2010-02-11 21:20:16 +01:00
4grab.py Sorting, multi category, multi resolution 2010-03-19 00:18:04 +01:00
config.py Sorting, multi category, multi resolution 2010-03-19 00:18:04 +01:00
COPYING Added license info and README 2010-02-09 02:45:56 +01:00
download.py Sorting, multi category, multi resolution 2010-03-19 00:18:04 +01:00
htmlparser.py Added license info and README 2010-02-09 02:45:56 +01:00
progressbar.py Removed some comments 2010-02-12 00:18:34 +01:00
README Added license info and README 2010-02-09 02:45:56 +01:00
sorter.py Sorting, multi category, multi resolution 2010-03-19 00:18:04 +01:00

4grab - a utility that downloads pictures from a certain website.

4grab was written to help me download wallpapers from a certain website, though it should easily be possible to use it as a general-purpose image downloader for that website.

4grab downloads pages 0-10 from the given category and parses these pages looking for links to threads. Afterwards it downloads these thread pages and starts looking in those for image links, skipping one each time because images there are linked to twice.
Finally, it goes through all the collected image links, checks whether each image already exists on disk and, if not, downloads it.
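The last two steps described above (skipping every second image link, then downloading only what is not yet on disk) can be sketched as follows. This is a minimal illustration in Python 3, not the code from download.py: the function names are hypothetical, it assumes the duplicate links sit directly next to each other, and it assumes files are stored under the last component of the URL path. Fetching and parsing the pages is left out.

```python
import os
from urllib.parse import urlsplit


def every_other(links):
    """Keep links 0, 2, 4, ... since each image is assumed to be linked twice in a row."""
    return links[::2]


def local_path(url, directory):
    """Map an image URL to the file it would be saved as (assumed naming scheme)."""
    filename = os.path.basename(urlsplit(url).path)
    return os.path.join(directory, filename)


def pending_downloads(links, directory, exists=os.path.exists):
    """Return the image URLs that are not on disk yet, in their original order."""
    return [
        url
        for url in every_other(links)
        if not exists(local_path(url, directory))
    ]
```

The exists parameter is only there so the check can be tried out without touching the file system; in normal use the default os.path.exists is enough.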

4grab is written in Python and therefore needs the Python interpreter (http://www.python.org) to be installed on your system. It is being tested on a Fedora 12 machine with Python 2.6.2 installed, and on a Windows XP machine with Python 2.6.4 installed.

v0.2.1 has been reported to work under Windows 7 as well.