One of the most common questions we get regarding Private Blog Networks is “What are your data minimums?” While, thankfully, the days of caring about PageRank are likely gone forever, we still need to measure a domain’s quality based on certain metrics. Since I don’t have an inside man at the Googleplex, I’m completely reliant on common sense and empirical data when selecting sites for my network.
Way back, when meta keyword tags were a serious ranking factor, I read about Google’s LSI patent and had a million thoughts and feelings about the future of search. Since then, Google has hired a mascot, Yahoo gave up and sold out to Bing, and we webmasters have watched every metric of real value get shut down (including PR, Yahoo’s Site Explorer & keyword queries, which now show up as “Not Provided” in analytics programs).
Looking back, one thing is for sure: my game at the time simply wasn’t good enough. A decade later, it’s safe to say the dust has settled and my game has gotten considerably better. Ahrefs & Majestic also arrived on the scene with perfect timing. Both are fantastic products that are insanely useful for all webmasters. For PBN builders, they’re just about the most necessary tools of the trade.
While everyone has a bias towards certain data, there are really only a few options that webmasters care about, so I’ll cover them all before getting into the “whys”.
- TTF: short for “Topical Trust Flow”, this is my favorite metric of the bunch (explanation as to why later). Basically, TTF sorts a site into topical categories based on the main category of the pages linking to it.
- Anchor Clouds: Majestic and Ahrefs both have one. Majestic’s is the faster of the two to check out at a glance. There are fewer links and the pages load faster. Ahrefs’ is much more thorough, but it’s well below the fold; it’s pretty much the last of the AJAX data to load. With anchor clouds, I care less about well-themed keywords than I do about generic anchor text. I want to see the natural anchor text: owner’s name, raw URLs & branded anchors. A good mix there is almost a guarantee that the site is clean.
- Majestic CF: Citation Flow as a metric is about references or citations. CF is the more malleable of the two Majestic flow metrics, since it’s really easy to get a website to list your URL in plain text compared to actually being able to drop a link.
- Majestic TF: Trust Flow is a metric about connections. Majestic starts off with a few trusted sites and then figures out how many “clicks” away from these hubs your site is. The more links from hubs, the higher your Trust Flow.
- Ahrefs DR: Ahrefs defines Domain Rating as the overall link strength of a domain. As a raw metric, you should be skeptical of it because it’s very easy to game. It’s extremely effective, though, if you take it in the context of a normal site. Most legitimate sites that don’t have a huge budget tend to hover somewhere around the high 20s to low 30s. If you take a few standard deviations from that number, you’ll see a sweet spot somewhere between a DR of 25 & 40. There are very legitimate sites above 40, but honestly that range is 90% spam.
- Moz Spam Score: The most poorly named statistic of the bunch, relative to what it actually does. Moz scrapes the SERPs and uses the data to categorize sites. As sites get deindexed, Moz stores away their metrics so it can build a list of what not to do. It’s a great idea and useful, but everyone is under the mistaken impression that it measures how spammy a site’s backlinks are.
- Parallel Niches – Reality check here. Does Coke link to Pepsi? Of course not. Do doctors link to other doctors? Generally not, unless they’re in the same practice. Is there any chance a roofer looking for work refers visitors to the competition? Sure, but in private. So why on earth would you go looking for expired domains in the exact same niche for a link? If the goal is to fly under the radar, this is a pretty big footprint to anyone looking at the data from a bird’s-eye view. It’s unlikely to be a smart long-term play.
- Quality Archive Copy – I am a huge fan of archive.org and greatly appreciate the service it provides. I like it even more that they allow me to grab archive copies of sites I’m looking to reboot. While there are a lot of obstacles when removing footprints from downloaded records, none of this would be possible without them. In order for your rebuilt site to look legitimate, you’ll at the very least need every page linked to from the homepage. It’s fine to 301 some of the deeper pages to the homepage, but sites that only have 1-3 pages are not realistic because nobody redirects “about” or “contact” back to the index. Also, it really makes sense to keep the question “where am I going to drop this link?” in mind while browsing archive.org. It sucks to buy something and have to make a link fit unnaturally. That’s a pretty big black eye if/when a manual reviewer ever shows up.
Getting to the Heart of the Matter
Data is fun to peruse; however, it’s far more efficient to spend your real analytics-crunching efforts on things that matter, like CTR & split testing. You’ll be no better off for spending 10 hours looking at data points from thousands of domains that you don’t own. The only way to make any sense of the matter is to cull the herd. Here are my standards. As you develop your own biases, I strongly suggest you revise them based not on those biases but on the results you see from your labor.
CF & TF combined need to be more than 20. TF needs to be equal to, or higher than, CF. No exceptions. It’s also useful to check out a couple of deeper pages if they’re handy (yeah, it’s a good idea to have a site’s archive in one tab while Ahrefs, Majestic, etc. are in other ones). Spam is easy to build to a single page; deep links spread across a site are less frequently fraudulent.
I like my DR higher than 25 but lower than 45. For some reason, Ahrefs allows this data point to be manipulated fairly easily. Any site that has a pulse is above 20. Big link building campaigns tend to push DR very high, and it shows when browsing exclusively high-DR sites.
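The numeric cuts above are easy to turn into a first-pass filter. Here is a minimal sketch in Python, assuming you have already exported CF, TF and DR for each candidate into your own records; the `DomainMetrics` class, the `passes_first_cut` function and the example domains are all made up for illustration and are not part of any Majestic or Ahrefs API.

```python
from dataclasses import dataclass


@dataclass
class DomainMetrics:
    """Hypothetical record holding metrics you exported by hand or by CSV."""
    name: str
    cf: int  # Majestic Citation Flow
    tf: int  # Majestic Trust Flow
    dr: int  # Ahrefs Domain Rating


def passes_first_cut(m: DomainMetrics) -> bool:
    """Apply the article's numeric standards: CF+TF > 20, TF >= CF, 25 < DR < 45."""
    flow_ok = (m.cf + m.tf) > 20 and m.tf >= m.cf
    dr_ok = 25 < m.dr < 45
    return flow_ok and dr_ok


# Made-up candidates, purely for illustration.
candidates = [
    DomainMetrics("example-bakery.com", cf=12, tf=15, dr=31),    # passes every cut
    DomainMetrics("example-spam.net", cf=40, tf=8, dr=62),       # TF below CF, DR too high
    DomainMetrics("example-thin.org", cf=6, tf=9, dr=28),        # CF + TF is only 15
    DomainMetrics("example-inflated.com", cf=20, tf=22, dr=50),  # DR above 45
]

shortlist = [m.name for m in candidates if passes_first_cut(m)]
print(shortlist)
```

Anything that survives this cut still needs the manual checks (archive copy, anchor cloud, legitimacy); the filter only saves you from eyeballing domains that were never going to qualify.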
I need an archive.org copy with multiple records from a legitimate business. There’s no easy way of getting “completeness” of a site so I rely on the site being important enough for the archive bot to periodically scrape.
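One rough way to see whether the archive bot came back periodically is to count captures per year. The sketch below assumes the Wayback Machine’s public CDX endpoint and its `url`, `output`, `fl`, `filter` and `collapse` parameters behave as documented; the helper names (`build_cdx_url`, `capture_years`, `fetch_capture_years`) and the sample rows are my own, and what counts as “enough” years is left to your judgment.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Assumed endpoint of the Wayback Machine CDX server.
CDX_ENDPOINT = "https://web.archive.org/cdx/search/cdx"


def build_cdx_url(domain: str) -> str:
    """Build a CDX query for successful captures, at most one per month."""
    params = {
        "url": domain,
        "output": "json",
        "fl": "timestamp,statuscode",
        "filter": "statuscode:200",
        "collapse": "timestamp:6",  # collapse on YYYYMM
    }
    return f"{CDX_ENDPOINT}?{urlencode(params)}"


def capture_years(rows):
    """Count captures per year from CDX JSON rows (the first row is the header)."""
    counts = {}
    for timestamp, _status in rows[1:]:
        year = timestamp[:4]
        counts[year] = counts.get(year, 0) + 1
    return counts


def fetch_capture_years(domain: str, timeout: int = 30):
    """Query the live endpoint (network call) and summarize captures by year."""
    with urlopen(build_cdx_url(domain), timeout=timeout) as resp:
        body = resp.read().decode("utf-8")
    rows = json.loads(body) if body.strip() else []
    return capture_years(rows)


# Offline illustration with made-up rows in the shape the JSON output uses.
sample_rows = [
    ["timestamp", "statuscode"],
    ["20100115000000", "200"],
    ["20100620000000", "200"],
    ["20130301000000", "200"],
]
print(capture_years(sample_rows))
```

Calling `fetch_capture_years("example.com")` would hit the live service. A domain with captures spread over several years is the kind of record I’m after; a single burst of captures in one month is not.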
To figure out if a site is “legitimate” aka not previously a PBN…
- How’s the logo? Is it authentic, generic or just words?
- Check out the about page. It could be labeled “company”, “board”, “hive” or something creative like that. Legit companies often have legit employees, a mission statement, and other distinguishing sections of the site that scream “hey, I’m real”.
- Maps, addresses, phone numbers…usually indicators of a real business.
- Blog platforms are a no-no. You can build a site on WordPress without it being a basic blog. Big red flag here.
- Storefront pictures. If you assume vanity, they’ll never disappoint.
NO spam in any anchor cloud. For me, this isn’t just about Viagra but any affiliate niche (UGGs & Nike, obviously), as well as commonly used niches like weight loss & muscle gain.
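If you export the anchor cloud as a list of strings, a quick keyword pass will surface the obvious offenders before you read the rest by eye. This is only a sketch: the term list is seeded from the niches named above and is nowhere near complete, and `spam_anchors` and the sample anchors are invented for illustration.

```python
import re

# Starter list taken from the niches mentioned in the article; extend it yourself.
SPAM_TERMS = ["viagra", "ugg", "uggs", "nike", "weight loss", "muscle gain"]

# Whole-word, case-insensitive match on any of the terms.
SPAM_PATTERN = re.compile(
    r"\b(" + "|".join(re.escape(term) for term in SPAM_TERMS) + r")\b",
    re.IGNORECASE,
)


def spam_anchors(anchors):
    """Return the anchors that contain at least one spam term."""
    return [anchor for anchor in anchors if SPAM_PATTERN.search(anchor)]


# Made-up anchor cloud for illustration.
anchors = [
    "Smith Family Bakery",
    "http://example-bakery.com",
    "click here",
    "cheap UGGs outlet",
    "best weight loss pills",
]

flagged = spam_anchors(anchors)
print(flagged)
```

An empty `flagged` list doesn’t prove the cloud is clean; it just means none of your known bad terms showed up. One hit is enough for me to walk away.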
I only briefly glance at Moz’s Spam score right before I buy the domain. If it’s like 5+ I might look deeper into it. It’ll almost certainly go up after we relaunch the sites since rarely are we able to move 100% of the content.
One obvious stat I skipped over is the links & referring domains data point. Majestic, Ahrefs & OSE all include it, but it’s so misleading. Ideally we’d be buying a domain with hundreds of links from dozens of unique domains…but if it’s one or two links from sites that matter, I will probably buy it. I always recommend you take special care and review referring domains, links and link types carefully. The last thing I want to do is rehabilitate a property that was just some noob’s sandbox.
For those that made it to the bottom of the article, congrats, you’re marginally smarter for it.
Now for Something More Actionable:
Alex Dealy, our SEO Director, put together a report of some SERPs for our clients and looked a little deeper. Here’s what we found …
If you want a simple and straightforward way to find quality domains without all the runaround, jumping in bed with TTF wouldn’t be a bad idea. And logically, it makes sense. Your money site should be linked to from other sites in your niche’s broader categories. To find out the desirable categories for your money site you can do one of 3 things (preferably all):
- Check out the TTF for a Wikipedia page in your niche.
- Check out the TTF for the Twitter profiles of known personalities in your niche.
- Run a version of the above screenshot and check out the TTF of who is actually ranking.
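Once you’ve noted the top TTF categories from those three sources, comparing them against a candidate domain is simple set work. Below is a minimal sketch, assuming you’ve copied the category labels out of Majestic by hand; the function names, the labels and the sample profiles are all placeholders rather than real data.

```python
from collections import Counter


def desirable_categories(reference_profiles, top_n=2):
    """Pick the categories that show up in the most reference profiles."""
    counts = Counter()
    for categories in reference_profiles.values():
        # Count each category once per profile, keeping a stable order.
        counts.update(list(dict.fromkeys(categories)))
    return [category for category, _ in counts.most_common(top_n)]


def topical_overlap(candidate_categories, desirable):
    """Return the candidate's categories that are also desirable."""
    return sorted(set(candidate_categories) & set(desirable))


# Made-up TTF categories for a fitness money site, purely for illustration.
reference_profiles = {
    "wikipedia page": ["Health/Fitness", "Health/Nutrition", "Shopping/Health"],
    "twitter profile": ["Health/Fitness", "Sports/Strength Sports"],
    "ranking site": ["Health/Fitness", "Health/Nutrition"],
}

wanted = desirable_categories(reference_profiles)
candidate = ["Health/Nutrition", "Recreation/Outdoors"]

print(wanted)
print(topical_overlap(candidate, wanted))
```

A candidate with no overlap at all is an easy pass. How much overlap is enough is something I’d tune against actual results rather than set in stone.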
The great part about this shortcut is that it’s not about my biases. It’s not about urban myths you read about on your favorite internet marketing forum, and it’s not a secret from a Skype or Facebook group. Nope, being biased towards TTF is based on cold, hard facts. Personally, if I am going to go to all the trouble of purchasing and rebuilding an expired domain, it better bloody work!
Now that I’ve shared how I qualify domains, please don’t hesitate to share a thought of your own. Thanks!