Are You Missing an Important Piece of Analytical Information?

Are you missing PDF download statistics?

PDFs downloaded directly from a Google search are most likely not recorded as events in Analytics.  Here is why.

Scenario 1: Direct PDF search on Google.  If you search for something like “Garage windows” and add PDF (“Garage windows pdf”), you will get a whole list of PDF files such as brochures.  These website files are indexed directly by Google.  If you click any of these links, the PDF file is pulled from the web server and opened in the browser's PDF viewer without executing any WordPress web pages.  This surprises many people, but these file downloads bypass any opportunity to run analytics code or Tag Manager.  Nothing is counted in Analytics for this scenario.

Figure: PDF search results

In Scenario 1 the web host's server may track statistics about downloads, but these statistics are not tied to Analytics with the rest of the website data.  Scenario 1 is quite common and represents a large loss of data for websites that have PDFs, indexed images, or other downloadable file types.  Since no webpage is executed, the Google Analytics (GA) script code never runs.

Scenario 2: Load PDF while on the website.  This scenario is what most people expect, and it does record analytics if properly handled.  The visitor goes to the website first and then clicks a PDF link on a page to download the file, so Analytics can count this as a click event.  Using Tag Manager or other coding techniques, statistics can be gathered about the file download.  Since at least one webpage is opened in this scenario, the website can gather statistics.  However, this scenario overlooks the fact that Google has indexed the website's files and serves them directly when requested, so Scenario 1 still represents large losses in data.

So how can one pull statistics into Analytics for Google-indexed file searches?

Can a methodology be created that allows files indexed by Google to execute GA code and record statistics?  I have struggled with this problem for years for clients that have had hundreds of PDF files.  The first technique I used, which was very labor intensive, was to create a unique webpage for every PDF file.  This technique does provide analytics, but the labor cost may be too high for some websites.

Another approach, which I developed after studying the situation, is to automatically redirect PDF file requests.  Here it is important to understand what code the web server executes when it serves a PDF or other file type in response to a Google search click.  It turns out that the website's .htaccess file does get executed on a PDF file request.  One can simply redirect these PDF file requests to a PDF showcase (or template) webpage.
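The redirect decision itself is small enough to sketch. The Python below is only an illustration of the mapping such a rule would perform, not the actual .htaccess code; the showcase path and the "r" parameter follow the example URLs in this article, while the function name and extension list are assumptions.

```python
from pathlib import PurePosixPath
from typing import Optional
from urllib.parse import urlencode

# Assumed values: "/showcase/" and "r" mirror the article's example URLs;
# the extension list is hypothetical and could include image types too.
SHOWCASE_PATH = "/showcase/"
REDIRECT_EXTENSIONS = {".pdf"}


def showcase_redirect(request_path: str) -> Optional[str]:
    """Return the showcase URL for a direct file request, or None to serve as-is."""
    path = PurePosixPath(request_path)
    if path.suffix.lower() not in REDIRECT_EXTENSIONS:
        return None  # ordinary pages are left alone
    # Pass only the file name along, as in ?r=abc.pdf
    return f"{SHOWCASE_PATH}?{urlencode({'r': path.name})}"


print(showcase_redirect("/files/abc.pdf"))  # /showcase/?r=abc.pdf
print(showcase_redirect("/about/"))         # None
```

A real rule would also need some way to let the showcase page itself fetch the file without being redirected again; that detail is not covered here and is left out of the sketch.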

Once on the showcase page, the requested PDF can automatically be embedded and displayed in the context of a website page.  Since a webpage is now executed, GA can record the fact that the PDF file was requested.  However, we still need a way to convey the name of the PDF file to GA, which can be done by automatically adding a URL search parameter.  For example:

https://example.com/showcase/?r=abc.pdf or

https://example.com/showcase/?r=123.pdf

Now GA can record how many PDFs are directly downloaded, and we can see how often each PDF file is downloaded.  The above examples will show up as separate webpages in GA because their search parameters are different, and GA records how many times each one is viewed along with all the other GA statistics.
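To make the per-file counting concrete, here is a minimal sketch that tallies views by the "r" parameter from a list of page URLs. The sample URLs are made up for illustration, and the function name is hypothetical; GA does this grouping on its own, so the code only shows the idea.

```python
from collections import Counter
from urllib.parse import parse_qs, urlsplit


def views_per_file(page_urls):
    """Tally showcase page views by the file named in the "r" parameter."""
    counts = Counter()
    for url in page_urls:
        params = parse_qs(urlsplit(url).query)
        for name in params.get("r", []):
            counts[name] += 1
    return counts


# Made-up page views, for illustration only.
sample = [
    "https://example.com/showcase/?r=abc.pdf",
    "https://example.com/showcase/?r=123.pdf",
    "https://example.com/showcase/?r=abc.pdf",
    "https://example.com/contact/",
]

for name, count in views_per_file(sample).most_common():
    print(name, count)
```

Pages without an "r" parameter, such as the contact page in the sample, are simply ignored by the tally.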

Steps required

The technique requires three simple additions that can quickly automate the embedding process for all website PDF files at once:

  • Code in .htaccess to redirect file requests to a webpage.
  • Create a dynamic showcase template page for displaying embedded files.
  • Disallow the paths to embedded files in robots.txt.
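The second and third steps can be sketched roughly as follows. This is an assumption-laden illustration rather than the code running on any real site: the /files/ directory, the file-name pattern, the embed markup, and the robots.txt contents are all hypothetical. The robots.txt check uses Python's standard urllib.robotparser module.

```python
import html
import re
from urllib.robotparser import RobotFileParser

# Assumptions: files live under /files/ and have simple names like "abc.pdf".
FILES_DIR = "/files/"
SAFE_NAME = re.compile(r"[A-Za-z0-9._-]+\.pdf", re.IGNORECASE)


def render_embed(requested: str) -> str:
    """Return embed markup for the showcase page, rejecting unexpected names."""
    if not SAFE_NAME.fullmatch(requested):
        return "<p>File not found.</p>"
    src = html.escape(FILES_DIR + requested, quote=True)
    return f'<embed src="{src}" type="application/pdf" width="100%" height="800">'


# Hypothetical robots.txt that disallows the directory holding the files.
ROBOTS_TXT = """\
User-agent: *
Disallow: /files/
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

print(render_embed("abc.pdf"))
print(parser.can_fetch("*", "https://example.com/files/abc.pdf"))        # False
print(parser.can_fetch("*", "https://example.com/showcase/?r=abc.pdf"))  # True
```

Restricting the parameter to plain file names keeps the showcase page from embedding arbitrary paths supplied in the URL.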

Examples

The files below will be redirected to a showcase page with the specific image or PDF file embedded.  Without the special code, these file requests would usually bring up the image or the PDF file without executing any webpages on my website.  With the code in place, these links request embedded files that show up as webpages on my website.

https://wsidignet.com/images/testimage1.jpg

https://wsidignet.com/images/testcase2.pdf

Conclusion – Stop missing out on PDF download statistics

This latest technique requires only minimal modifications to a website and automatically covers all of the PDFs or other files on it.  Stop missing this opportunity for your analytics and automate your file downloads with an approach similar to the one described here.

The author of this technique is Mark Colestock, an Analytics Internet Consultant at WSI Dignet, mcolestock@wsidignet.com