My wife recently started a preschool in Bellevue, and I took ownership of her online advertising and analytics. I'm no stranger to online advertising. I know that even well-dialed campaigns are a crapshoot compared to offline promotions.
I assumed I could outsmart the system by focusing on traffic from Google searches alone and ignoring display.
Shockingly, I learned how naive I was about the scale and source of the fraud. Below are my findings.
It has been 3 weeks and I have spent close to $400. More than $350 was spent on search. Google Adwords showed 132 clicks, but my wife's business did not see any of it.
Digging deeper, I discovered her site's heat map had no engagement from Google referrers:
Clicks were mostly from direct emails my wife sent out earlier.
How is that possible? It was time to look at the Google sessions.
Selecting CPC medium for paid traffic only
What are all these near-zero time sessions?
In Google's ideal world, a high bounce rate reflects a disconnect between content and what users expect
At this point I know that a big portion of clicks did not even engage with the site. The billion dollar question: can I prove it?
So far I found that there were 71 clicks (53% of total clicks) between 2:00 am and 5:00 am.
Adwords confirms the clicks and the hours:
Are parents with young children searching for a preschool in the middle of the night? Time to ask my friends who have young kids. Maybe I'm missing something.
Three calls later, I am convinced my results are suspicious. But as a one-time student of math, I am well aware that I need a bigger sample size.
Google Search Trends to the rescue:
Search volume by hour of the day
Only 13% of all searches for my clicked keywords happen between 2 and 5 am. The ads shown for my wife's business were triggered by variations of these keywords.
This is unusual because I selected standard ad delivery across my campaigns:
Change History confirmed the same Delivery method
So, no, parents are not searching for a preschool in the middle of the night. Moreover, the exposure of my wife's site to search during off-peak times is abnormal.
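To put a rough number on how unlikely this is, here is a minimal sketch of a one-sided binomial test. It assumes, for illustration only, that clicks are independent and that genuine clicks would follow the same hourly distribution as searches (13% between 2 and 5 am). Neither assumption is exact, and the function name `binomial_tail` is mine, so treat the result as an order-of-magnitude sanity check rather than proof.

```python
from math import comb


def binomial_tail(k: int, n: int, p: float) -> float:
    """Probability of seeing at least k successes in n trials, each with chance p."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


total_clicks = 132     # clicks reported by Adwords
night_clicks = 71      # clicks between 2:00 am and 5:00 am
baseline_share = 0.13  # share of searches in that window, from the hourly search data

# Assumption: genuine clicks would be spread over the day like searches are.
expected_night_clicks = total_clicks * baseline_share
p_value = binomial_tail(night_clicks, total_clicks, baseline_share)

print(f"expected night clicks: {expected_night_clicks:.1f}, observed: {night_clicks}")
print(f"chance of at least {night_clicks} night clicks: {p_value:.3g}")
```

Under these assumptions I would expect about 17 night-time clicks, not 71, and the probability of landing on 71 or more by chance is vanishingly small.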
An additional issue affecting our exposure is how competitors schedule their ads. But, it plays less of a role because of the following observations.
Let's now look at these sessions more closely:
This is short for a preschool website. In life there are a few activities that we devote a lot of time to: finding the right preschool must be at the top of the list. I bet that even with crappy sites, parents spend more than a few seconds.
But I still wonder if prospective customers who clicked the ads were immediately overwhelmed by the website.
The bounce rate shows me there was no interest beyond the landing page. But did prospective customers even see the website? Let's look at the page load time:
Unfortunately, we can't conduct this analysis using just Google Analytics. With Google Analytics, the session duration is reported as zero for sessions that bounce, so even if a session lasted 10 seconds, GA would still record it as zero seconds.
Luckily, from the get-go I had more tools in my arsenal. Enter GA's competing product, Yandex Metrica, and my favorite, Webvisor. Since I had Webvisor installed, Yandex has been tracking the full duration of sessions.
Webvisor starts recording as soon as 1 second after the HTML header loads
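The reason is that GA derives session duration from the gap between the first and last hit it receives, and a bounce sends only one hit. Here is a minimal sketch of that logic. The function name `ga_style_duration` and the timestamps are made up for illustration; this is not GA's actual code.

```python
from datetime import datetime, timedelta


def ga_style_duration(hit_times: list[datetime]) -> timedelta:
    """Duration measured as the time between the first and last recorded hit."""
    if len(hit_times) < 2:
        # A single-hit session (a bounce) has no second timestamp to subtract from.
        return timedelta(0)
    return max(hit_times) - min(hit_times)


landing = datetime(2016, 1, 1, 2, 30, 0)  # hypothetical landing-page hit

# The visitor reads the landing page for 10 seconds, then leaves: one hit only.
bounce = [landing]
# The visitor opens a second page 10 seconds after landing: two hits.
two_page = [landing, landing + timedelta(seconds=10)]

print(ga_style_duration(bounce).total_seconds())    # 0.0
print(ga_style_duration(two_page).total_seconds())  # 10.0
```

Both visitors spent the same 10 seconds on the landing page, but only the second one shows up with a non-zero duration.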
The majority of Google Adwords clicks are zero durations:
Here are visits from Google Adwords. Remember those 2:00 am to 5:00 am visits?
Measuring session duration with the time counter is an approximation. The server is not pinged every millisecond. What was actually 3-4 seconds may have shown up as 1 second. And looking at a page for 3-4 seconds is probably enough time to decide if it's crap.
I am not convinced by this argument. Yet, if I play skeptic, I have to entertain this possibility too:
Need to demonstrate the alternative hypothesis to prove the above 3-4 second argument is bogus.
Conveniently, Yandex Metrica has a tool to build confidence intervals around Time on Site (for simplicity, assume Page Load Time is 2.5-3 seconds).
Even at this confidence interval, about a third of the midnight clicks did not see the website because it did not have time to load.
Rejecting the skeptic's 3-4 second argument in favor of the alternative (my argument)
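The check behind this is simple: count the sessions that ended before the page could have finished loading. Here is a minimal sketch. The durations below are invented sample values, not my actual Yandex Metrica data, and the load-time thresholds are assumptions for illustration.

```python
def share_not_loaded(durations_s: list[float], load_time_s: float) -> float:
    """Fraction of sessions that ended before the page finished loading."""
    if not durations_s:
        raise ValueError("need at least one session")
    return sum(d < load_time_s for d in durations_s) / len(durations_s)


# Made-up session durations in seconds, for illustration only.
sample_durations = [0, 1, 1, 2, 4, 7, 12, 30, 45]

# Try both ends of an assumed page load time range.
for load_time in (2.5, 3.0):
    share = share_not_loaded(sample_durations, load_time)
    print(f"load time {load_time}s: {share:.0%} of sessions ended before the page loaded")
```

Running the same count against the real recorded durations, at both ends of the load-time range, shows how sensitive the estimate is to that assumption.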
To summarize, about 65% of clicks are bounces. 53% are abnormal and occur in the middle of the night. And half of these abnormal clicks (22% of total) are fraudulent.
So far, 22% is my lowest fraud estimate. However, 53% is my best guess. But, who cares? Who can benefit from this information?
My wife's business is competing with many (hundreds of) daycare centers and preschools. Just 9 of these are advertising on Google Adwords for the same keywords:
Adwords advertisers for my clicked keywords
Most of these schools have hundreds or even thousands of locations across the country:
goddardschool.com, lapetite.com, kindercare.com, brighthorizons.com, kiddieacademy.com, evergreenacademy.com
It is amazing how few preschools advertise on Adwords for our target location. According to Yelp, there are hundreds of them in our location.
Preschools don't advertise?
Have they been pushed out due to low conversion?
After skimming through the above list of large franchises, I discovered the following 3rd party scripts on their websites:
simpli.fi (Local Programmatic Advertising & DSP Platform), silverpop.com (Marketing Automation), Clicktale (Monetizing & Conversion), and Omniture (does not require introduction) among them
Following Google's own suggestion, I searched for "simpli.fi virus". It returned a lot of web results - not that I am trying to accuse simpli.fi or any of the above 3rd party services by association. There will always be plenty of bad agents out there. The question is whether Google is genuinely doing absolutely everything to fight them, or whether it is a case of one hand feeding the other. After all, the interests align closely.
So far I have several theories for what is going on. The central premise is that one of the 3rd party ad services used on these corporate sites delivers low CPC (good) and high conversion (also good) to its customer by directing spam traffic towards competing higher bidders (not good), such as my wife's business.
In fact, Criteo, Google's largest ad competitor, even conducted its own technology study in a suit against SteelHouse, one of its smaller competitors, when the latter managed to steal away a number of Criteo's clients (Lara O'Reilly's original story). The gist of the study: clicks to your site can be generated on behalf of a user without that user's knowledge of your website.
Another alternative explanation: Botnets.
When I brought up clicks during odd hours to my dad, he recognized the pattern from his work as a Network Security Expert (he works at Checkpoint). He explained it like this:
A big share of our phones and computers are infected with malware that can send traffic anywhere at someone's mere will. To make this traffic less noticeable to the owner of the device, much of it is sent when the device is not being used by the owner (i.e., in the middle of the night).
from Sasha, one of Google's fraud fighters
What I am really saying is that someone should put together a test. To put together such a test would only require:
So far, such research has happened exclusively in start-ups that were quickly bought out by Google. @veritasium conducted his study of Facebook fraud, but the closed nature of the experiment leaves room for Google and Facebook officials to undermine it.
Why not perform an open study?