How to Use Python to Download 100 Cat Pics in One Go
This Python Hack is Purr-fect!
Alright, so we all know cats are awesome, right? But this blog post isn't just about cats: today, we're gonna whip up a Reddit scraper using Python that can snag images from any subreddit you want! Pretty epic, right? Imagine the look on your crush's face when you show up with a whole bucket of cat pics 😼. Just a heads-up, though: this is all in good fun and for educational purposes, so don't go using your newfound powers for anything sketchy (by sketchy, I mean illegal). Let's dive in!
Setting Up a Reddit App
For starters, let’s first set up a Reddit app, which will allow us to access Reddit’s API. This process is straightforward and will provide you with the necessary credentials to authenticate your requests.
Step 1: Log in to Reddit Developer Portal
Start by logging into your Reddit account. Once you're logged in, open https://www.reddit.com/prefs/apps
Step 2: Create a New Reddit App
On this page, locate and click the button labeled "Are you a developer? Create an app." or "Create another app". This will bring up a form where you can configure your new app.
Step 3: Configure Your App
Select the Type of App: Choose the "Script" option. This is ideal for personal scripts and tools that interact with Reddit's API.
Name and Describe Your App: Give your app a meaningful name and a brief description. This will help you identify the app later, especially if you create multiple apps.
Redirect URL: For the redirect URL, enter http://localhost:3000/. This is commonly used during local development, but it can be changed later when you deploy your application.
Submit the Form: Once you've filled out all the necessary fields, click the "Create App" button.
Step 4: Secure Your Client ID and Secret
After creating your app, you’ll be provided with a Client ID and Client Secret. These credentials are crucial as they will be used to authenticate your API requests. Make sure to store them securely, as you’ll need them in the next steps of your development process.
(Don’t even think about trying my credentials—I know how lazy some of you are! 😂 But no worries, I'll be changing them before I post this blog anyway!)
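One way to store the credentials securely is to keep them out of the source file and read them from environment variables. Here is a minimal sketch of that idea; the variable names REDDIT_CLIENT_ID and REDDIT_CLIENT_SECRET and the helper load_reddit_credentials are made up for this example, not something Reddit or PRAW requires.

```python
import os


def load_reddit_credentials(env=os.environ):
    """Return (client_id, client_secret) read from environment variables."""
    # REDDIT_CLIENT_ID and REDDIT_CLIENT_SECRET are hypothetical names
    # chosen for this sketch; use whatever names you prefer.
    try:
        return env["REDDIT_CLIENT_ID"], env["REDDIT_CLIENT_SECRET"]
    except KeyError as missing:
        # Fail early with a readable message instead of a bare KeyError.
        raise RuntimeError(
            f"Set the {missing.args[0]} environment variable first"
        ) from None
```

The two returned values can then be passed as client_id and client_secret when the connection is set up later, so the secret never ends up in a file you might share.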
Bringing Our Reddit Tool to Life
Wow, you made it this far—I'm impressed! 🎉
Now, let's get our hands dirty with some code. This section is going to be smooth sailing. You can either follow along step-by-step or, if you're more of a "just give me the tool" person, feel free to grab the complete code from my GitHub repository. But trust me, coding it yourself is half the fun!
Install the Dependencies
Before we dive into the code, we need to ensure your environment is ready. This tutorial is built with Python, so if you haven’t installed Python yet, now's the time! You can download it from python.org. Once Python is installed, you’ll need to install a couple of essential libraries.
Fire up your terminal and run:
pip install praw requests
or
pip3 install praw requests
A little overview of the libraries we just installed:
praw: This is the Python Reddit API Wrapper, a simple yet powerful library that lets you interact with Reddit's API.
requests: A popular HTTP library in Python for making requests and handling responses easily.
With these dependencies in place, you’re all set to start building!
Let’s Start Coding
Step 1: Set Up Your Project
First things first, let's create a folder to keep everything organized. Name it reddit-script (or whatever you like). Inside this folder, create a new file called main.py. This is where we'll write all our code.
Step 2: Setup Reddit API Credentials
Before we can start downloading images, we need to connect to Reddit's API. This requires setting up credentials using the PRAW library. We'll need a client_id, a client_secret, and a user_agent. We already got the client_id and client_secret in the previous section. For user_agent we can write any value, but we will write "postdownloader".
At the beginning of your main.py file, import the required libraries and set up the connection like this:
import praw
import requests
import os

reddit = praw.Reddit(
    client_id="your_client_id",
    client_secret="your_client_secret",
    user_agent="postdownloader",
)
This code initializes the connection, allowing your script to interact with Reddit.
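A short user_agent like "postdownloader" works, but Reddit's API guidelines suggest a more descriptive one in the form platform:app ID:version (by /u/username). Here is a small sketch of building such a string; the helper build_user_agent and all the values passed to it are placeholders for illustration.

```python
def build_user_agent(platform, app_id, version, username):
    """Build a user agent in the format Reddit's API guidelines suggest."""
    return f"{platform}:{app_id}:{version} (by /u/{username})"


# All four values below are placeholders; swap in your own.
print(build_user_agent("python", "postdownloader", "v1.0", "your_username"))
# prints: python:postdownloader:v1.0 (by /u/your_username)
```

The resulting string can be passed as the user_agent argument in place of "postdownloader".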
Step 3: Download Image Function
Now, let's create a function that handles the actual downloading of images. This function, named download_image, will take two parameters: the URL of the image and the path where you want to save it.
Here's the code for the download_image function:
def download_image(url, path):
    response = requests.get(url)
    if response.status_code == 200:
        with open(path, 'wb') as f:
            f.write(response.content)
        print(f"Downloaded {path}")
    else:
        print(f"Failed to download {url}")
Understanding what's happening:
Let’s break down the function step by step:
Sending the HTTP Request:
response = requests.get(url)
- This line sends an HTTP GET request to the URL provided as an argument. The requests.get(url) function fetches the content from the specified URL, which in this case is an image file. The result of this request is stored in the response variable.
Checking the Response Status:
if response.status_code == 200:
- Here, the function checks if the request was successful by examining the status code of the response. A status code of 200 means the request was successful, and the server returned the image data correctly. If the status code is anything other than 200, it means there was an issue with fetching the image.
Saving the Image:
with open(path, 'wb') as f:
    f.write(response.content)
- If the response is successful, the function proceeds to save the image. The open(path, 'wb') command opens a file in write-binary mode ('wb'), which is necessary for writing binary data like images. f.write(response.content) writes the content of the response (the image data) to the file specified by path. The with statement ensures that the file is properly closed after writing the data.
Providing Feedback:
print(f"Downloaded {path}")
- After successfully saving the image, the function prints a confirmation message indicating the image has been downloaded and shows the path where it was saved.
Handling Errors:
else:
    print(f"Failed to download {url}")
- If the response status code is not 200, the function enters the else block and prints an error message, indicating that the image could not be downloaded from the provided URL.
This function is the core of your script, enabling it to download images from Reddit and save them to your local machine with just a URL and a specified file path.
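A 200 status code only says the request worked, not that the server actually sent an image (some links answer with an HTML page instead). As a rough sketch, assuming you want a slightly stricter check, a small helper can also look at the Content-Type header. The name looks_like_image is made up for this example.

```python
def looks_like_image(status_code, content_type):
    """Return True when the request succeeded and the body is an image type."""
    # Content-Type can carry extra parameters such as "; charset=utf-8",
    # so keep only the media type before comparing.
    media_type = content_type.split(";", 1)[0].strip().lower()
    return status_code == 200 and media_type.startswith("image/")


print(looks_like_image(200, "image/jpeg"))                # True
print(looks_like_image(200, "text/html; charset=utf-8"))  # False
print(looks_like_image(404, "image/png"))                 # False
```

Inside download_image, the condition could then become looks_like_image(response.status_code, response.headers.get("Content-Type", "")), so web pages are reported as failures instead of being saved as broken image files.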
Step 4: Main Function
The main function is where everything comes together. It's responsible for taking user input, setting up directories, fetching posts, and calling the download_image function to save the images.
def main():
    subreddit_name = input("Enter the subreddit: ")
    num_photos = int(input("Enter the number of photos to download: "))
    directory = f"image/{subreddit_name}"
    if not os.path.exists(directory):
        os.makedirs(directory)
    subreddit = reddit.subreddit(subreddit_name)
    count = 0
    # Fetch a large number of posts to increase chances of finding enough images
    for post in subreddit.top(limit=1000):
        # Only posts that link directly to an image file are downloaded
        if post.url.endswith(('.jpg', '.jpeg', '.png')):
            count += 1
            # Keep the original extension so PNG files are not saved as .jpg
            extension = os.path.splitext(post.url)[1]
            image_name = f"{directory}/{subreddit_name}_{count}{extension}"
            download_image(post.url, image_name)
        if count >= num_photos:
            break
    print(f"Downloaded {count} images from r/{subreddit_name}")
Here’s the breakdown:
User Input: The script first asks for the subreddit name and the number of images to download.
Create Directory: A directory named image/[subreddit_name] is created to store the images.
Fetch and Download Images: The script fetches posts from the specified subreddit, checks if the post contains an image, and then downloads it.
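The "checks if the post contains an image" step looks at how the URL ends, which can miss links that have a query string after the file name. Here is a minimal sketch of a slightly more forgiving check that only looks at the path part of the URL. The helper image_extension and the example.com URLs are made up for illustration.

```python
import os
from urllib.parse import urlparse

IMAGE_EXTENSIONS = {".jpg", ".jpeg", ".png"}


def image_extension(url):
    """Return the lowercase image extension of the URL path, or None."""
    # urlparse(...).path drops any query string or fragment after the file name.
    path = urlparse(url).path
    extension = os.path.splitext(path)[1].lower()
    return extension if extension in IMAGE_EXTENSIONS else None


print(image_extension("https://example.com/cat.PNG?width=640"))  # .png
print(image_extension("https://example.com/gallery/abc"))        # None
```

In main, a post could be skipped whenever image_extension(post.url) returns None, and the returned extension could be reused when building the file name.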
Step 5: Running the Script
End your code with the following lines to ensure the script runs when executed directly:
if __name__ == "__main__":
main()
This ensures that the main function is called when you run the script, starting the process of downloading images based on your inputs.
The Grand Finale: Let the Fun Begin!
You're all set! Now, it's time to fire up the app and start filling your folders with all the cat pics, memes, or whatever else your heart desires. Hit that run button, grab some snacks, and watch the magic happen as your image collection takes off. Happy downloading, and enjoy the meme hoard!
And hey, stay tuned—there’s more cool stuff coming your way!