3 Ways to Use XPaths with Large Site Audits

When used creatively, XPaths can help improve the efficiency of auditing large websites. Consider this another tool in your SEO toolbelt.

There are endless types of information you can unlock with XPaths, which can be used in any category of online business.

Some popular ways to audit large sites with XPaths include:

  • Redirect maps
  • Ecommerce audits
  • Blog audits

In this guide, we’ll cover exactly how to perform these audits in detail.

What Are XPaths?

Simply put, XPath is a syntax that uses path expressions to navigate XML documents and identify specified elements.

It is used to find the exact location of any element on a page using the HTML DOM structure.

We can use XPaths to extract bits of information such as H1 page titles, product descriptions on ecommerce sites, or really anything that’s available on a page.

While this may sound complex to many people, in practice it’s actually quite easy!
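As a rough illustration of the idea, here is a minimal Python sketch that evaluates two path expressions against a small, made-up page. The markup, class names, and author name are hypothetical, and the standard library's ElementTree supports only a limited subset of XPath, so treat this as a sketch and not a full XPath engine.

```python
import xml.etree.ElementTree as ET

# Hypothetical, well-formed page markup used only for illustration.
PAGE = """
<html>
  <body>
    <h1>3 Ways to Use XPaths</h1>
    <div class="post-meta">
      <span class="author"><a href="/author/jane-doe/">Jane Doe</a></span>
    </div>
    <p>Body copy goes here.</p>
  </body>
</html>
"""

root = ET.fromstring(PAGE)

# Path expression for the first H1 anywhere below the root element.
h1_text = root.find(".//h1").text

# Path expression with an attribute predicate, similar in spirit to
# what a browser's "Copy XPath" option produces.
author_text = root.find(".//span[@class='author']/a").text

print(h1_text)
print(author_text)
```

Each expression describes a route through the document tree, which is the same thing a crawler does when it applies your XPath to every page it visits.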

How to Use XPaths in Screaming Frog

In this guide, we’ll be using Screaming Frog to scrape webpages.

Screaming Frog offers custom extraction methods, such as CSS selectors and XPaths.

It’s entirely possible to use other means to scrape webpages, such as Python. However, the Screaming Frog method requires far less coding knowledge.

(Note: I’m not in any way currently affiliated with Screaming Frog, but I highly recommend their software for web scraping.)

Step 1: Identify Your Data Point

Figure out what data point you want to extract.

For example, let’s pretend Search Engine Journal didn’t have author pages and you wanted to extract the author name for each article.

What you’ll do is:

  • Right-click on the author name.
  • Select Inspect.
  • In the dev tools Elements panel, you’ll see your element already highlighted.
  • Right-click the highlighted HTML element, go to Copy, and select Copy XPath.

[Screenshot: Copy XPath]

At this point, your computer’s clipboard will have the desired XPath copied.

Step 2: Set up Custom Extraction

In this step, you’ll need to open Screaming Frog and set up the website you want to crawl. In this instance, I would enter the full Search Engine Journal URL.

  • Go to Configuration > Custom > Extraction

[Screenshot: setting up XPath extraction]

  • This will bring up the Custom Extraction configuration window. There are a lot of options here, but if you’re looking to simply extract text, match your configuration to the screenshot below.


Step 3: Run Crawl & Export

At this point, you should be all set to run your crawl. You’ll notice that your custom extraction is the second to last column on the right.

When analyzing crawls in bulk, it makes sense to export your crawl into an Excel format. This will allow you to apply a variety of filters, pivot tables, charts, and anything your heart desires.

3 Creative Ways XPaths Help Scale Your Audits

Now that we know how to run an XPath crawl, the possibilities are endless!

We have access to all of the answers; now we just need to find the right questions.

  • What are some aspects of your audit that could be automated?
  • Are there common elements in your content silos that can be extracted for auditing?
  • What are the most important elements on your pages?

The exact problems you’re trying to solve may vary by industry or site type. Below are some unique situations where XPaths can make your SEO life easier.

1. Using XPaths with Redirect Maps

Recently, I had to redesign a site that required a new URL structure. The former pages all had parameters as the URL slug instead of the page name.

This made creating a redirect map for hundreds of pages a complete nightmare!

So I thought to myself, “How can I easily identify each page at scale?”

After analyzing the various page templates, I came to the conclusion that the actual title of the page looked like an H1 but was actually just large paragraph text. This meant that I couldn’t just get the standard H1 data from Screaming Frog.

However, XPaths would allow me to copy the exact location of each page title and extract it in my web scraping report.

In this case, I was able to extract the page title for all of the old URLs and match them with the new URLs through the VLOOKUP function in Excel. This automated most of the redirect map work for me.

With any automated work, you may have to perform some spot checking for accuracy.
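For readers who prefer a script to a spreadsheet, the same title-matching idea can be sketched in Python. The URLs and titles below are invented, and matching on a lowercased, whitespace-trimmed title is an assumption standing in for the VLOOKUP step. Pages without a match are set aside for manual review.

```python
def build_redirect_map(old_pages, new_pages):
    """Match old URLs to new URLs by extracted page title.

    old_pages and new_pages are lists of (url, title) tuples,
    e.g. rows taken from two crawl exports.
    Returns (redirects, unmatched_old_urls).
    """
    def normalize(title):
        # Collapse whitespace and ignore case so trivial differences still match.
        return " ".join(title.split()).lower()

    # Lookup table: normalized title -> new URL (plays the role of the VLOOKUP range).
    new_by_title = {normalize(title): url for url, title in new_pages}

    redirects, unmatched = {}, []
    for url, title in old_pages:
        target = new_by_title.get(normalize(title))
        if target:
            redirects[url] = target
        else:
            unmatched.append(url)  # needs a manual decision
    return redirects, unmatched


# Hypothetical sample data.
old_pages = [
    ("https://example.com/page?id=17", "Blue Widgets"),
    ("https://example.com/page?id=42", "Red  Widgets "),
    ("https://example.com/page?id=99", "Discontinued Item"),
]
new_pages = [
    ("https://example.com/blue-widgets/", "Blue Widgets"),
    ("https://example.com/red-widgets/", "Red Widgets"),
]

redirects, unmatched = build_redirect_map(old_pages, new_pages)
for old_url, new_url in redirects.items():
    print(old_url, "->", new_url)
print("Needs manual review:", unmatched)
```

The unmatched list doubles as a spot-checking queue: anything the script could not pair is exactly what a person should look at first.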

2. Auditing Ecommerce Sites with XPaths

Auditing ecommerce sites can be one of the more challenging types of SEO auditing. There are many more factors to consider, such as JavaScript rendering and other dynamic elements.

Sometimes, stakeholders will need product-level audits on an ad hoc basis. Sometimes this covers just categories of products, but sometimes it may be the entire site.

Using the XPath extraction method we learned earlier in this article, we can extract all types of data including:

  • Product title
  • Product description
  • Price
  • Review data
  • Image URLs
  • Product Category
  • And much more

This can help identify products that may be lacking valuable information within your ecommerce site.

The cool thing about Screaming Frog is that you can extract multiple data points to stretch your audits even further.
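Once several data points are extracted, spotting products with missing information can be scripted. The sketch below reads a small, made-up export and reports which fields are empty for each URL. The column names ("Address", "Product Title 1", and so on) are assumptions, so adjust them to match the headers in your own export.

```python
import csv
import io

# Hypothetical crawl export; the column names are assumptions, so adjust
# them to match the headers in your own file.
EXPORT = """Address,Product Title 1,Price 1,Description 1
https://example.com/p/1,Blue Widget,19.99,A sturdy blue widget.
https://example.com/p/2,Red Widget,,A bright red widget.
https://example.com/p/3,Green Widget,24.50,
"""

FIELDS_TO_CHECK = ["Product Title 1", "Price 1", "Description 1"]


def find_missing_fields(rows, fields):
    """Return {url: [missing field names]} for rows with empty extractions."""
    report = {}
    for row in rows:
        missing = [f for f in fields if not (row.get(f) or "").strip()]
        if missing:
            report[row["Address"]] = missing
    return report


rows = list(csv.DictReader(io.StringIO(EXPORT)))
missing_report = find_missing_fields(rows, FIELDS_TO_CHECK)
for url, missing in missing_report.items():
    print(url, "is missing:", ", ".join(missing))
```

To run this against a real file, replace the `io.StringIO(EXPORT)` argument with an opened CSV file.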

3. Auditing Blogs with XPaths

This is a more common method for using XPaths. Screaming Frog allows you to set parameters to crawl specific subfolders of sites, such as blogs.

However, using XPaths, we can go beyond simple meta data and grab valuable insights to help identify content gap opportunities.

Categories & Tags

One of the most common ways SEO professionals use XPaths for blog auditing is scraping categories and tags.

This is important because it helps us group related blogs together, which can help us identify content cannibalization and gaps.

This is typically the first step in any blog audit.
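As a small sketch of that grouping step, the Python below counts posts per category and lists planned categories that have no posts at all. The URLs, category names, and the idea of a "planned categories" list are all hypothetical.

```python
from collections import defaultdict

# Hypothetical (url, category) pairs, as they might come out of a crawl
# with a category extraction configured.
posts = [
    ("https://example.com/blog/xpath-basics/", "Technical SEO"),
    ("https://example.com/blog/crawl-budget/", "Technical SEO"),
    ("https://example.com/blog/log-files/", "Technical SEO"),
    ("https://example.com/blog/link-outreach/", "Link Building"),
]

# Hypothetical list of categories the blog is supposed to cover.
planned_categories = ["Technical SEO", "Link Building", "Local SEO"]

urls_by_category = defaultdict(list)
for url, category in posts:
    urls_by_category[category].append(url)

# Categories with many posts are worth checking for cannibalization;
# categories with none are possible content gaps.
counts = {c: len(urls_by_category.get(c, [])) for c in planned_categories}
gaps = [c for c, n in counts.items() if n == 0]

for category, n in sorted(counts.items(), key=lambda item: -item[1]):
    print(f"{category}: {n} post(s)")
print("Possible gaps:", gaps)
```

A high count does not prove cannibalization. It only tells you which group of posts to read side by side first.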


Keywords

This step is a bit more Excel-focused and advanced. Here’s how it works: you set up an XPath extraction to pull the body copy out of each blog.

Fair warning, this may drastically increase your crawl time.

When you export this crawl into Excel, you will get all of the body text in one cell. I highly recommend that you disable text wrapping, or your spreadsheet will look terrifying.

Next, in the column to the right of your extracted body copy, enter the following formula:


In this formula, A1 equals the cell of the body copy.

To scale your efforts, you can have your “keyword” equal the cell that contains your category or tag. However, you may consider adding multiple columns of keywords to get a more accurate and robust picture of your blogging performance.

This formula will return a TRUE/FALSE Boolean value. You can use this to quickly identify keyword opportunities and cannibalization in your blogs.
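The same TRUE/FALSE check can be sketched outside of Excel. The Python below tests whether each post's body copy contains its category plus a few extra keywords. The rows and keywords are invented, and treating the check as a case-insensitive substring match is an assumption about how the spreadsheet formula behaves (it is how a pattern such as ISNUMBER(SEARCH("keyword", A1)) works in Excel, though that may not be the exact formula the author used).

```python
def contains_keyword(body_copy, keyword):
    """Return True if keyword appears in body_copy, ignoring case."""
    return keyword.lower() in body_copy.lower()


# Hypothetical rows: (url, category, extracted body copy).
rows = [
    ("https://example.com/blog/xpath-basics/", "XPath",
     "This post explains how XPath expressions locate elements."),
    ("https://example.com/blog/crawl-budget/", "Crawl Budget",
     "Large sites should watch how often bots request each URL."),
]

# Extra keyword columns give a fuller picture than the category alone.
extra_keywords = ["elements", "bots"]

results = []
for url, category, body in rows:
    flags = {category: contains_keyword(body, category)}
    for kw in extra_keywords:
        flags[kw] = contains_keyword(body, kw)
    results.append((url, flags))
    print(url, flags)
```

A post whose own category comes back False, like the second row here, is a candidate for a content refresh or a re-categorization.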


Author

We’ve already covered this example, but it’s worth noting that this is still an important element to pull from your articles.

When you blend your blog export data with performance data from Google Analytics and Search Console, you can start to determine which authors generate the best performance.

To do this, sort your blogs by author and start tracking average data sets including:

  • Impressions – Search Console
  • Clicks – Search Console
  • Sessions – Analytics
  • Bounce Rate – Analytics
  • Conversions – Analytics
  • Assisted Conversions – Analytics
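The averaging step can be sketched in Python once the crawl export is blended with the performance data. All names and numbers below are made up, and only three of the metrics are shown to keep the example short.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical blended rows: author from the crawl, metrics from
# Search Console and Analytics exports. All numbers are made up.
posts = [
    {"author": "Jane Doe", "clicks": 120, "impressions": 4000, "sessions": 150},
    {"author": "Jane Doe", "clicks": 80, "impressions": 2000, "sessions": 90},
    {"author": "John Roe", "clicks": 30, "impressions": 1500, "sessions": 40},
]

METRICS = ["clicks", "impressions", "sessions"]

posts_by_author = defaultdict(list)
for post in posts:
    posts_by_author[post["author"]].append(post)

# Average each metric per author.
averages = {
    author: {m: mean(p[m] for p in author_posts) for m in METRICS}
    for author, author_posts in posts_by_author.items()
}

# Rank authors by average clicks, highest first.
ranking = sorted(averages, key=lambda a: averages[a]["clicks"], reverse=True)
for author in ranking:
    print(author, averages[author])
```

Averages over one or two posts are noisy, so it is worth checking how many articles sit behind each author's numbers before drawing conclusions.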

Share Your Creative XPath Tips

Do you have some creative auditing methods that involve XPaths? Share this article on Twitter or tag me @seocounseling and let me know what I missed!


Image Credits

All screenshots taken by author, October 2019
