Scraping the WordPress Plugin Repo for Search Terms

Today’s code snippet is a shell script that scrapes WordPress plugin repository search results. I had reservations about releasing it because it can easily be overused, so please use it responsibly and don’t hammer the repo with requests.

#!/bin/bash
echo "Name,URL,Slug"
IFS=$'\n'       # make newlines the only separator, so each <h4> line is one loop item
for j in $(curl -s 'https://wordpress.org/plugins/search.php?page=[1-12]&q=search+term' | grep '<h4>.*</h4>')
do
    # Name: the text after the second ">" and before the next "<", with &amp; decoded
    echo -n "$j" | awk 'BEGIN { FS = ">" } ; { print $3 }' | awk 'BEGIN { FS = "<" } ; { print $1 }' | sed 's/&amp;/\&/g' | tr -d "\n"
    echo -n ,
    # URL: the first double-quoted value on the line (the href)
    echo -n "$j" | awk 'BEGIN { FS = "\"" } ; { print $2 }' | tr -d "\n"
    echo -n ,
    # Slug: the fifth "/"-separated piece of the URL
    echo -n "$j" | awk 'BEGIN { FS = "\"" } ; { print $2 }' | awk 'BEGIN { FS = "/" } ; { print $5 }' | tr -d "\n"
    echo ""
done
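
To see how the field splitting works without touching the network, here is a minimal sketch that runs the same awk and sed steps on one hard-coded line. The sample markup and the parse_line helper name are my assumptions for illustration, inferred from what the script expects; the real search results page may be structured differently.

```shell
#!/bin/bash
# Hypothetical sample line, shaped the way the script's awk calls expect.
sample='<h4><a href="https://wordpress.org/plugins/example-plugin/">Example &amp; Plugin</a></h4>'

# parse_line is an illustrative helper, not part of the original script.
parse_line() {
    local line="$1" name url slug
    # Name: third ">"-separated field, cut at the next "<", with &amp; decoded
    name=$(printf '%s' "$line" | awk -F'>' '{ print $3 }' | awk -F'<' '{ print $1 }' | sed 's/&amp;/\&/g')
    # URL: the first double-quoted value on the line
    url=$(printf '%s' "$line" | awk -F'"' '{ print $2 }')
    # Slug: fifth "/"-separated field of the URL
    slug=$(printf '%s' "$url" | awk -F'/' '{ print $5 }')
    printf '%s,%s,%s\n' "$name" "$url" "$slug"
}

parse_line "$sample"
```

For the sample line this prints Example & Plugin,https://wordpress.org/plugins/example-plugin/,example-plugin as a single CSV row.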

Drop that in a file named WHATEVER.sh and run it on either macOS or Linux, and it should output the name, URL, and slug of each plugin matching the search term, one comma-separated row per plugin. Of course, change the page range as well as the search term on line 4 of the script.
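
If editing the script for every search gets tedious, one option is to pass the search term and page range as arguments and fetch one page at a time with a pause in between, which is also gentler on the repo. This is only a sketch: the build_url and scrape_pages names, the argument order, and the 2-second default delay are my assumptions, and it prints the matching <h4> lines as-is rather than turning them into CSV.

```shell
#!/bin/bash
# build_url and scrape_pages are illustrative names, not from the original script.

# Build the search URL for one page; spaces in the term become "+".
build_url() {
    local term="$1" page="$2" plus_term
    plus_term=$(printf '%s' "$term" | tr ' ' '+')
    printf 'https://wordpress.org/plugins/search.php?page=%s&q=%s' "$page" "$plus_term"
}

# Fetch pages first..last one at a time, pausing between requests.
scrape_pages() {
    local term="$1" first="${2:-1}" last="${3:-1}" delay="${4:-2}" page
    page=$first
    while [ "$page" -le "$last" ]; do
        curl -s "$(build_url "$term" "$page")" | grep '<h4>.*</h4>'
        sleep "$delay"      # arbitrary pause so the repo is not hit back-to-back
        page=$((page + 1))
    done
}

# Only run when a search term is given, e.g.: ./WHATEVER.sh "search term" 1 12
if [ "$#" -ge 1 ]; then
    scrape_pages "$@"
fi
```

Each line it prints can then go through the same awk steps as in the script above.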

