Scrape Reddit for Testing Data using SQL Server & Python

Target Audience:

You’re the person who has to create data for your Test environments. Let’s scrape data from Reddit!

Abstract:

You’re lucky enough to have Test environments, but you don’t want to spend forever creating test data to use (because you’re not using live data, right?). Let’s use SQL Server with Python to scrape data from Reddit to fill those tables. We’ll go through:

1. What you’ll need installed to use Python with SQL Server

2. The basics of the Python language

3. The Python libraries we’ll use

4. The code itself, with a walkthrough

5. How you can apply this in your own environment

You can then go away and create your test data so you don’t have to worry about GDPR or any other data privacy issues!
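
To give a flavour of the idea, here is a minimal sketch and not the session's actual code. It assumes Reddit's public ".json" listing endpoint is reachable (it may be rate-limited or need authentication), and the column list, the `RedditPost` table name and the function names are all hypothetical. It uses only the standard library and writes through any DB-API cursor that takes "?" placeholders, which is the style pyodbc uses for SQL Server.

```python
import json
import urllib.request

# Hypothetical column set; pick whichever post fields suit your test tables.
COLUMNS = ("id", "title", "author", "score", "created_utc")


def fetch_listing(subreddit, limit=25):
    """Download a subreddit's newest posts as a parsed JSON listing.

    Assumption: the public ".json" endpoint is available to you. Reddit may
    rate-limit it or require an authenticated client instead.
    """
    url = f"https://www.reddit.com/r/{subreddit}/new.json?limit={limit}"
    request = urllib.request.Request(
        url, headers={"User-Agent": "test-data-sketch/0.1"}
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)


def listing_to_rows(listing):
    """Flatten a listing into one tuple per post, in COLUMNS order."""
    posts = (child["data"] for child in listing["data"]["children"])
    return [tuple(post.get(column) for column in COLUMNS) for post in posts]


def insert_rows(cursor, rows, table="RedditPost"):
    """Insert rows through a DB-API cursor that uses "?" placeholders.

    The table name is interpolated into the SQL text, so only pass a
    trusted, hard-coded name here.
    """
    placeholders = ", ".join("?" for _ in COLUMNS)
    sql = f"INSERT INTO {table} ({', '.join(COLUMNS)}) VALUES ({placeholders})"
    cursor.executemany(sql, rows)
```

Against SQL Server you would pass in a cursor from a pyodbc connection and commit afterwards. The table needs matching columns, and the same functions can be tried against an in-memory SQLite database first.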

Why I Want to Present This Session:

I’ve recently had to do this for some internal systems, so why not share it with the community so everybody can benefit?

About the author, Rich Benner:

I'm a performance tuning DBA who loves polishing up SQL Servers.

1 Comment:

I’d be interested in this session. We’re also struggling with test data, GDPR and creating it ourselves.
