How To Back Up Your Data From Reddit
For those who didn’t catch the memo, I’m a massive advocate for taking (3–2–1 compliant) backups.
I don’t only back up my computer and NAS. I also make sure that everything I entrust to the cloud, including these writings on Medium, is backed up in at least two other places (typically the NAS and archival cloud backup storage).
Why back up your cloud data in the first place, you might ask?
Let’s take Reddit as an example, since it’s the subject of this article.
What would happen if:
- Your Reddit account were hacked? (To minimize the chances of this happening, configure 2FA). Wouldn’t you want a copy of all the posts you sent up to the network?
- You somehow lost access to both the email address you registered with and your password, and couldn’t get back into the account?
- Reddit vanished off the face of the earth one morning and all the hundreds/thousands of posts and comments you had contributed to the network went with it?
Cloud does not equal backup — it’s just somebody else’s (professionally managed) computer or network of them.
If you want to remain 3–2–1 compliant (the golden rule of backups), you should keep two backup copies of all primary data, which gives you three copies in total. One of those backups should be stored offsite, and both should be on separate storage media from the original data source.
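To make the rule concrete, here is a minimal Python sketch that checks a set of copies against the rule as it is worded above. The `Copy` class, its field names, and the storage labels are all made up for illustration; they are not part of any backup tool.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class Copy:
    """One copy of a dataset (hypothetical structure, for illustration only)."""
    label: str     # free-text name, e.g. "NAS"
    storage: str   # the device or service that holds this copy
    offsite: bool  # True if the copy lives away from your primary location


def is_321_compliant(original: Copy, backups: list[Copy]) -> bool:
    """Check the rule as stated above: two backups, one of them offsite,
    and neither stored on the same media as the original."""
    two_backups = len(backups) >= 2
    one_offsite = any(b.offsite for b in backups)
    separate_storage = all(b.storage != original.storage for b in backups)
    return two_backups and one_offsite and separate_storage


# Example: Reddit holds the original; the NAS and archival cloud hold backups.
reddit = Copy("Reddit account", "reddit", offsite=True)
nas = Copy("NAS", "home-nas", offsite=False)
archive = Copy("Archival cloud", "archival-cloud", offsite=True)

print(is_321_compliant(reddit, [nas, archive]))  # True
print(is_321_compliant(reddit, [nas]))           # False: only one backup
```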
How do we implement this to protect the comments, votes, and posts that we entrust to Reddit’s digital custodianship?
To my knowledge, the best way to do this is to manually export your data every X months. Yes, this isn’t ideal (you really want your backups to be automated), but I’d rather do this than never take a copy of my post history.
How To Request Your Data From Reddit
Reddit doesn’t have a native data export feature, at least at the time of writing.
Instead the procedure is to file a data request with the team. You do so through this link:
Here, you will have options to:
- Request data exports under the General Data Protection Regulation (GDPR) or the California Consumer Privacy Act (CCPA).
- Request a data export covering your full time on Reddit or only a specific date range.
I typically use the GDPR data request option and download the archive that pertains to my full time on Reddit.
After a couple of days you will receive a private message from /u/RedditDataRequests:
The message contains a link from which you can download your export archive.
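Once the archive is downloaded, it still needs to land in two other places to count as a proper backup. The following is a rough Python sketch of that step, not an official tool: it copies the archive into a dated folder under each destination and verifies every copy with a SHA-256 checksum. The function names and the `reddit-export-<date>` folder naming are my own assumptions, and the destination paths are whatever your NAS or synced cloud folder happens to be.

```python
from __future__ import annotations

import hashlib
import shutil
from datetime import date
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Return the SHA-256 hex digest of a file, read in 1 MiB chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(1024 * 1024), b""):
            digest.update(chunk)
    return digest.hexdigest()


def archive_export(export: Path, destinations: list[Path]) -> list[Path]:
    """Copy the export into a dated folder under each destination.

    Raises OSError if any copy does not match the source checksum.
    """
    expected = sha256_of(export)
    stamp = date.today().isoformat()  # e.g. 2024-01-31
    copies = []
    for root in destinations:
        target_dir = root / f"reddit-export-{stamp}"
        target_dir.mkdir(parents=True, exist_ok=True)
        target = target_dir / export.name
        shutil.copy2(export, target)  # copy2 also preserves timestamps
        if sha256_of(target) != expected:
            raise OSError(f"checksum mismatch for {target}")
        copies.append(target)
    return copies
```

You would call it with the path to the downloaded archive and a list of destination folders, for example a mounted NAS share and a folder that syncs to archival cloud storage.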
Here’s what the archive contains (again, screenshots current at the time of writing):
Here’s what the data looks like. I clicked into the comments.csv file, which contains an export of all the comments that you have written on Reddit. Yes, all of them: mine had more than 2,400 rows!
The comments archive contains:
- The unique comment ID
- The permalink to the comment itself
- The date and time in UTC when the comment was left
- The subreddit it was posted in (the name appears without the reddit.com/r prefix)
- A link to the parent thread
- The unique ID of the parent thread
- The body text of the comment itself
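A quick way to sanity-check the export is to count your comments per subreddit. The sketch below uses Python’s built-in csv module. It assumes the subreddit column is literally named `subreddit`, which I’m inferring from the field list above, so check the header row of your own file and pass a different `column` if it differs.

```python
import csv
from collections import Counter


def comments_per_subreddit(csv_file, column="subreddit"):
    """Count rows per subreddit in an open comments.csv file.

    `column` is an assumed header name; adjust it to match your export.
    """
    reader = csv.DictReader(csv_file)
    if reader.fieldnames is None or column not in reader.fieldnames:
        raise KeyError(f"no {column!r} column; found {reader.fieldnames}")
    return Counter(row[column] for row in reader)


def print_top_subreddits(path, limit=10):
    """Print the subreddits you have commented in most often."""
    # newline="" lets the csv module handle line breaks inside comment bodies
    with open(path, newline="", encoding="utf-8") as fh:
        counts = comments_per_subreddit(fh)
    for name, total in counts.most_common(limit):
        print(f"{total:6d}  {name}")
```

Calling `print_top_subreddits("comments.csv")` from the folder where you unpacked the archive should list your ten most active subreddits. The total across all subreddits should match the number of rows in the file.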
If you open chat_history.csv you’ll get a full record of all the private messages you’ve exchanged with other Reddit users:
All your voting history (upvoting and downvoting) is meticulously recorded in comment_votes.csv:
ip_logs.csv has a full login history, including your registration IP:
If you ever want to move to a new account and can’t remember which subreddits you subscribed to, you can find them in your subscribed_subreddits.csv file:
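If you do move accounts, a short script can turn that file into a list of links to click through. This is only a sketch: the `subreddit` column name is an assumption on my part, so check the header of your own file. The names are stored without the reddit.com/r prefix, so the code adds it back.

```python
import csv


def subreddit_links(csv_file, column="subreddit"):
    """Return sorted subreddit URLs from an open subscribed_subreddits.csv.

    `column` is an assumed header name; adjust it to match your export.
    """
    reader = csv.DictReader(csv_file)
    if reader.fieldnames is None or column not in reader.fieldnames:
        raise KeyError(f"no {column!r} column; found {reader.fieldnames}")
    # Skip blank rows, de-duplicate, and sort case-insensitively
    names = sorted(
        {row[column].strip() for row in reader if row[column]},
        key=str.lower,
    )
    return [f"https://www.reddit.com/r/{name}/" for name in names]
```

Open the file with `open("subscribed_subreddits.csv", newline="", encoding="utf-8")`, pass the file object to `subreddit_links`, and print the results one per line.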