Automate cleaning up your comments using the StackExchange API

Back when I still used Twitter, I implemented Vicky Lai's "ephemeral" Go script to delete tweets older than a certain age. No one reads your old tweets unless they have a reason to, like you're running for public office or applying for jobs. And people say far too many stupid, snarky, callous things online to just have it all lingering out there attached to their names forever, yet that's what most people do.

Recent events at SE have me thinking about my contributions to their network too, and which of those I should/could clean up. Like Twitter, comments on SE have a limited shelf-life, and there's no reason to leave them hanging around. With the help of the Stack Exchange API, you can even automate deleting your comments.

A couple things to note:

Authentication

Before you can do much of anything, you'll need to prove who you are, which allows you to make a greater variety and number of API calls. It's a convoluted multi-part process that makes simple experimentation and personal scripts a huge pain.

1. Create a new application (to get a key and client id)

Open Stack Apps, create a new account if you don't have one yet, and fill in the required info. Since this is just an app for your own use, don't worry about most of the values. I specified a GUID for my app name, and "localhost" for the domain and website.

2. Create a post for your app (required for write access)

Post a new thread for your "app", because it's required for write access. Since the app is not ready for production (duh), follow these placeholder instructions.

  1. Make an "app" (or "script") post for your app (or script).
  2. Tag it with "placeholder".
  3. Add "PLACEHOLDER -" to the beginning of the title.

If you decide at some point that you do want to make this thing public, there are more detailed instructions here. But we don't need those right now. It's probably a good idea to delete the post when you're done using your app/script.

It'll be interesting to see if this survives. Given the level of snark around creating practice apps, I don't know if my "placeholder" post will survive the couple of months I intend to run this script.

3. Edit your application (to include the URL of the post)

Believe it or not, this is required for write_access.

Apps must have a registered Stack Apps post to write. All content created via the API will have links pointing back to an app's Stack Apps post, to aid in giving an app's author feedback and in reporting abusive content.

You can add or change your app's registered post from the Stack Apps App Management page. Removing a registered post will disable write for your application, as will deleting the registered post.

Now go back to your apps and open the one you just created. Scroll down to "Apps Post" and start typing the title of your new post. Select it, save, and voila. I didn't see the "Apps Post" field when I created the application originally - maybe I missed it?

After you submit changes, you'll see your new post title on the summary page.

4. Generate an access token (using your new client id)

Per these instructions, enter the following URL into a browser and click "Approve".

This is a one-time process with the no_expiry scope applied. I wouldn't advise that if you were creating a real app, although the default expiry of 1 day seems excessively short.

https://stackoverflow.com/oauth/dialog?client_id=<your_client_id>&scope=no_expiry,write_access&redirect_uri=https://stackoverflow.com/oauth/login_success
[Screenshot: Stack Exchange API authorization prompt. The last two bullets are due to the write_access and no_expiry scopes.]

[Screenshot: Stack Exchange API authorization success. After clicking "Approve", you receive an "access_token" with no expiration date.]

Nothing to it.
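
If you'd rather not hand-edit that URL, here's a minimal Python sketch that assembles it from the same parts. The function names and the "12345" client id are made up for illustration, and the second helper assumes the token comes back in the fragment (the part after "#") of the page you're redirected to, so check what your browser's address bar actually shows.

```python
from urllib.parse import parse_qs, urlencode, urlparse


def build_auth_url(client_id,
                   scopes=("no_expiry", "write_access"),
                   redirect_uri="https://stackoverflow.com/oauth/login_success"):
    """Assemble the authorization URL from step 4 (hypothetical helper)."""
    query = urlencode(
        {
            "client_id": client_id,
            "scope": ",".join(scopes),
            "redirect_uri": redirect_uri,
        },
        safe=",:/",  # keep the URL readable, matching the one shown above
    )
    return "https://stackoverflow.com/oauth/dialog?" + query


def extract_access_token(redirected_url):
    """Pull access_token out of the URL fragment (assumed location)."""
    fragment = urlparse(redirected_url).fragment
    return parse_qs(fragment)["access_token"][0]


# Paste the result into a browser and click "Approve".
auth_url = build_auth_url("12345")  # "12345" is a placeholder client id
```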

Kicking the Tires

Okay, we finally have all the cogs and bolts in place, and it's time to see how the StackExchange API actually works.

Get a Comment

I'd suggest using the Postman Client to make your API requests when you're experimenting. It's easy to use and keeps everything sync'd in the cloud. Here's my request, and the resulting 3 comments I got back (because I set the pagesize).
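
For anyone who prefers a script to Postman, the same request can be sketched in Python with only the standard library. This is a rough equivalent under a few assumptions: the 2.3 API version, the /me/comments route for "my comments", and the helper names are my choices here, not something taken from the screenshots. The API sends gzip-compressed responses, so the sketch decompresses before parsing.

```python
import gzip
import json
import urllib.request
from urllib.parse import urlencode

API_ROOT = "https://api.stackexchange.com/2.3"  # assumed API version


def comments_url(site, key, access_token, pagesize=3):
    """Build the GET URL for the authenticated user's comments."""
    params = {
        "site": site,
        "pagesize": pagesize,
        "key": key,
        "access_token": access_token,
    }
    return f"{API_ROOT}/me/comments?{urlencode(params)}"


def fetch_comments(site, key, access_token, pagesize=3):
    """Call the API and return the list under "items" in the response wrapper."""
    with urllib.request.urlopen(comments_url(site, key, access_token, pagesize)) as resp:
        raw = resp.read()
        if resp.headers.get("Content-Encoding") == "gzip":
            raw = gzip.decompress(raw)
    return json.loads(raw)["items"]
```

Calling fetch_comments("stackoverflow", key, access_token) with your real key and token should return up to three comments, each with a comment_id you can use in the next step.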

Delete a Comment

Now we can try deleting a comment. Just grab the first one returned in the previous GET request, and do a POST to delete it (I hate that). All gone (hopefully).

Now you see it!
Now...
... you don't!
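
The delete call can be sketched the same way. This assumes the 2.3 API version and the /comments/{id}/delete route, with the key, token and site sent as form fields in the POST body; the function name and the example id are placeholders. Building the request separately from sending it makes it easy to inspect before anything actually gets deleted.

```python
import urllib.request
from urllib.parse import urlencode

API_ROOT = "https://api.stackexchange.com/2.3"  # assumed API version


def build_delete_request(comment_id, site, key, access_token):
    """Prepare (but do not send) the POST that deletes one comment."""
    body = urlencode(
        {"site": site, "key": key, "access_token": access_token}
    ).encode("ascii")
    return urllib.request.Request(
        f"{API_ROOT}/comments/{comment_id}/delete",
        data=body,
        method="POST",
    )


# To actually delete, send it (requires a token with write_access):
#   urllib.request.urlopen(build_delete_request(123456, "stackoverflow", key, token))
```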

Something More Advanced

I don't feel like running this every day to clean up my old comments, so I created a .NET Core app called SECommentHoover that uses RestSharp to get my comments and then delete them one by one. There's a built-in throttle to limit how fast API calls are made. I won't repost the code here - check out the repo.

Here's the output from running it. First it deletes some upvoted comments, which are limited to 20 unique posts a day. Then it deletes other comments, up to the 10,000 daily quota limit.
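
The real code is C# and lives in the repo, but the general shape of a throttled delete loop looks something like this Python sketch. It is not the SECommentHoover implementation. The delete_one callback is hypothetical and is assumed to return the parsed API response wrapper as a dict, from which the loop reads the "error_id", "backoff" (seconds) and "quota_remaining" fields.

```python
import time


def delete_all(comment_ids, delete_one, ms_between_calls=500, sleep=time.sleep):
    """Delete comments one by one, pausing between calls.

    delete_one(comment_id) is a caller-supplied function (hypothetical)
    that performs the API call and returns the parsed response wrapper.
    Returns the number of deletes that came back without an error.
    """
    deleted = 0
    for comment_id in comment_ids:
        response = delete_one(comment_id)
        if "error_id" not in response:
            deleted += 1
        # Wait for the configured delay, or longer if the API asked us to back off.
        sleep(max(ms_between_calls / 1000, response.get("backoff", 0)))
        # Stop early once the daily quota is used up.
        if response.get("quota_remaining", 1) <= 0:
            break
    return deleted
```

Passing sleep in as a parameter is only there so the loop can be exercised without actually waiting.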

Set Up AWS Lambda

Based on work I did a while back to tweet random blog posts, I set up an AWS Lambda job to run this once a day. Now I don't have to click on a bunch of individual comments one by one, and that makes me happy.

Get the code

  1. Clone SECommentHoover to your machine, open it in VS, and "publish" it in order to generate the full executable and RestSharp NuGet dependency (see below).
  2. Find the DLL files in bin/Debug/netcoreapp3.0 and select them all.
  3. Zip up the files in the directory, but not the directory itself.
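
If you'd rather script step 3 than do it by hand, here's one way to zip a folder's contents without the folder itself, sketched in Python. The function name is made up for illustration. Write the zip somewhere outside the folder you're zipping, otherwise the archive may try to include itself.

```python
import zipfile
from pathlib import Path


def zip_directory_contents(source_dir, zip_path):
    """Zip every file under source_dir, storing paths relative to it.

    The archive root then holds the DLLs directly rather than a
    wrapping directory.
    """
    source = Path(source_dir)
    with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as archive:
        for path in sorted(source.rglob("*")):
            if path.is_file():
                archive.write(path, arcname=path.relative_to(source).as_posix())
    return zip_path
```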

Publish the app (locally)

  1. Right-click the project in VS and select "Publish".
  2. Select "Folder" on the left in the popup.
  3. Click "Create Profile".

  4. Click "Publish" and then zip the files as described above.

Create a Lambda function

Create a new AWS Lambda function. You can see how I did it before, or just set things up accordingly:

  1. Create a new function and choose whatever the latest .NET Core runtime is, currently .NET Core 2.1 (even though my app is .NET Core 3.0).
  2. The name of the function and the role you create don't matter.
  3. Under "Function code", click "Upload" and upload your zip file.
  4. Set the handler as SECommentHoover::SECommentHoover.Program::Main
  5. Under "Basic settings", decrease the memory to 128MB and increase the timeout to the max allowed 15 minutes... more on that in a moment.
  6. Set the environment variables that you'll need. These include SE_NETWORK_SITE (e.g. "stackoverflow"), MS_BETWEEN_API_CALLS (e.g. "500"), and ACCESS_TOKEN and KEY, which were generated in the previous steps.
  7. Click "Add Trigger", choose "CloudWatch Events", then "Schedule expression", and set a cron expression, like cron(0 12 * * ? *) for every day at noon UTC. While you're testing this, you might want to uncheck the "Enable trigger" box.
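
To illustrate how a job might consume the environment variables from step 6, here's a small Python sketch. The variable names come from the list above, but the function, the dictionary keys and the fallback of 500 ms are assumptions for illustration; the actual app reads them in C#.

```python
import os


def load_settings(env=os.environ):
    """Read the job's settings from environment variables.

    Missing required variables raise KeyError, so a misconfigured
    function fails right away instead of partway through a run.
    """
    return {
        "site": env["SE_NETWORK_SITE"],
        # Assumed fallback of 500 ms when the variable isn't set.
        "ms_between_calls": int(env.get("MS_BETWEEN_API_CALLS", "500")),
        "access_token": env["ACCESS_TOKEN"],
        "key": env["KEY"],
    }
```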

A word of caution

The max AWS Lambda timeout is 15 minutes. If you're planning on making 10,000 requests per day (the max allowed by the API), you'd have to set the MS_BETWEEN_API_CALLS value to about 90ms. While that's still far above the 30ms that breaks the "30 requests per second" rule, and we're just doing simple deletes instead of requesting tons of data, you might want to set the value higher and schedule your job to run a couple times a day. Or not. Caveat emptor.
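
The arithmetic behind that 90ms figure, as a quick Python sketch. It ignores the time each request itself takes, so treat the numbers as a best case; the function names are just for illustration.

```python
import math


def min_delay_ms(timeout_minutes, total_calls):
    """Largest delay between calls that still fits total_calls in one run."""
    return timeout_minutes * 60 * 1000 / total_calls


def runs_needed(total_calls, delay_ms, timeout_minutes):
    """How many scheduled runs it takes to get through total_calls."""
    calls_per_run = int(timeout_minutes * 60 * 1000 // delay_ms)
    return math.ceil(total_calls / calls_per_run)


print(min_delay_ms(15, 10_000))      # 90.0 ms between calls for one 15-minute run
print(runs_needed(10_000, 500, 15))  # 6 runs a day at a gentler 500 ms
```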

Also (and I haven't been able to recreate this locally), my AWS logs show 20 lines of output for each upvoted comment it tries to delete, and the max is supposed to be 20, yet I can still go to the site and delete a few more. I haven't checked it closely, but one of several things is happening:

  • The job is deleting one comment in a particular thread, but I have other comments on the same thread that aren't deleted. Then I just happen to hit those when I manually delete a few more, which doesn't count toward the max unique posts limit.
  • My delete code fails for certain comments, even though it works for most, and AWS doesn't log the resulting exception even though it should, since I don't catch anything in the code.
  • Some of the "delete" calls failed, but since they don't return an error I have no way to know why.

After investigating the results of a few runs, it looks like their API just randomly fails. If I track down the comment myself and delete it, it works. If I run the job again manually, it invariably grabs the same comment id and does delete it the second time around. I may modify the job to grab 25 or 30 comments, knowing a few may fail.
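
Since a second attempt usually succeeds, another option is to retry the failures within the same run. Here's a minimal Python sketch of that idea, not something the job currently does. The delete_one callback is hypothetical and is assumed to return True when the delete worked and False when it didn't.

```python
def delete_with_retry(comment_ids, delete_one, max_attempts=2):
    """Try each delete, then retry only the ones that failed.

    Returns the ids that still could not be deleted after max_attempts passes.
    """
    pending = list(comment_ids)
    for _ in range(max_attempts):
        # Keep only the ids whose delete call reported failure.
        pending = [cid for cid in pending if not delete_one(cid)]
        if not pending:
            break
    return pending
```

Anything left in the returned list is worth logging, since that's exactly the information missing from the runs described above.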

Author

Grant Winney

Is there anything more satisfying than sharing knowledge? Of teaching someone and witnessing their "ah ha" moment? I usually write about tech, but no promises. I hope you find something interesting!


