Not all bots out there are good bots. Crawling your sites are malicious automated programs that can cause a number of problems. These bots inflate analytics data and can be responsible for attacks on your site. They are even handy tools for scraping and copying your content. Before you can properly use analytics to accomplish anything, you need to have good analytics data. The old saying “Garbage in, garbage out” is doubly true in this case.
Content traffic data is important in determining how successful your content marketing strategy has been over a period of time. It also allows you to develop new strategies built around improving your content marketing.
Bot visits skew this data so that you can’t trust the numbers to gauge your performance. It’s crazy: the 2014 bot traffic report by Incapsula found that over 50% of all website traffic was due to bots. The report also found that as you build your brand and become more recognizable, you are likely to see a bigger influx of bots to your site.
3 Key Ways to Deal with Bots & Win
Sounds like a “lose-lose” situation, doesn’t it? If you improve your visibility, market more and widen your outreach, bots will also take more notice of you, and the chances of a bot attack go up along with your popularity. Luckily, you can control bot visits relatively easily. The idea behind blocking bots is to find out where they come from and stop them at the source.
This might seem like a complicated piece of internet wizardry, but the reality is as simple as the following steps:
- Locate your Log Files: Every web server keeps log files that record each request to the website along with the IP address it came from. These logs are usually stored on the server itself. If your hosting company gives you cPanel as your hosting front-end, you can access your logs by clicking the link in the main cPanel window (the first window you visit). If you use Apache as your web server, your logs will typically be under the /var/log folder (for example /var/log/apache2 or /var/log/httpd, depending on the distribution). IIS users can find their logging settings through the local computer’s control panel: select Administrative Tools, then Internet Services Manager, select the website, right-click it and choose Properties, open the Web Site tab, go to the logging properties and finally open the General Properties tab, which shows where the log files are kept. The exact labels vary between IIS versions. In typical fashion, the Microsoft logs are the hardest to get your hands on.
- Figuring Out the Most Visits by IP and User Agents: Once you have downloaded your log files, it’s a simple matter to consolidate them into a single text file and then import that file into Excel (or whatever tool you prefer for viewing logs). Excel is a convenient way to manipulate the data so that you can make sense of it. When you import the data, choose the space delimiter so that each field lands in its own column. Quoted fields that contain spaces, such as the request line and the user agent, may be split across several columns and need to be rejoined. With a little cleaning up you’ll have usable data almost immediately.
Using Excel’s Pivot Table Builder, you can create a pivot table that links the number of visits to each Client IP and get a feel for how many visits come from potentially malicious IPs. The Client IP is the address each request came from, and it can give you a rough idea of where most of your visitors are based. Renaming the table headers to Client IP, Hits and User Agent gives you a setup to determine which IPs have visited your site the most. The User Agent identifies the browser version and operating system that the visitor reports running. Simple bots often leave this field blank, so a blank User Agent combined with a high hit count is a strong hint that you are looking at a bot. Keep in mind that more sophisticated bots send a fake browser User Agent, so an unusually high number of hits from a single IP is worth investigating even when the field is filled in.
- Blocking the IP: After you’ve figured out which IPs belong to bots, you can exclude their traffic from your analytics reports. Additionally, if you’re concerned about security, you can also block the bots from accessing the site altogether. Google Analytics gives you the option to filter out individual IPs. It also comes with a built-in bot filter that you can enable in the Admin panel under View Settings by selecting “Exclude all hits from known bots and spiders.” This is a handy tool for filtering your analytics to get a more realistic view of your outreach. Omniture (now Adobe Analytics) gives you a bit more control over how your analytics are viewed and tabulated: you can exclude individual IPs, exclude a set of IPs (if you have a large number of bot entries) or create a processing rule that ignores certain IPs and IP ranges.
On the server side of the spectrum, you can limit access to your site based on a visitor’s IP. cPanel includes a handy IP Deny Manager which allows you to enter the IPs you want to deny access to. In Apache, access control by IP is provided by the mod_authz_host module. You can place its directives either in the main server configuration or in a .htaccess file (a per-directory configuration file, not a module); the main configuration is the preferred place when you have access to it. In IIS, open IIS Manager, go to the Features View, open IP Address and Domain Restrictions, and then in the Actions pane choose Add Deny Entry and enter the IP address of the bot.
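If you would rather not do the counting in Excel, the same steps can be sketched in a few lines of Python. This is only a minimal sketch under some assumptions: it assumes your logs are in the Apache "combined" format, the function names (`tally_hits`, `suspicious_ips`, `apache_deny_rules`) are made up for this example, and the `min_hits` threshold of 100 is an arbitrary illustrative value rather than a recommendation. The last function prints Apache 2.4 `Require not ip` lines that you could review before pasting into your configuration.

```python
import re
from collections import Counter

# Assumes the Apache "combined" log format; other formats need another pattern.
LOG_LINE = re.compile(
    r'^(?P<ip>\S+) \S+ \S+ \[[^\]]*\] "[^"]*" \d{3} \S+ "[^"]*" "(?P<agent>[^"]*)"'
)


def tally_hits(lines):
    """Count hits per (client IP, user agent) pair, skipping unparseable lines."""
    hits = Counter()
    for line in lines:
        match = LOG_LINE.match(line)
        if match:
            hits[(match.group("ip"), match.group("agent"))] += 1
    return hits


def suspicious_ips(hits, min_hits=100):
    """Return IPs with a blank user agent or at least min_hits hits.

    min_hits is an illustrative threshold; tune it to your own traffic.
    """
    flagged = set()
    for (ip, agent), count in hits.items():
        if agent in ("", "-") or count >= min_hits:
            flagged.add(ip)
    return sorted(flagged)  # sorted as text, not numerically


def apache_deny_rules(ips):
    """Render an Apache 2.4 <RequireAll> block that refuses the given IPs."""
    rules = ["<RequireAll>", "    Require all granted"]
    rules += [f"    Require not ip {ip}" for ip in ips]
    rules.append("</RequireAll>")
    return "\n".join(rules)


if __name__ == "__main__":
    # Made-up sample lines using reserved documentation IP ranges.
    sample = [
        '203.0.113.5 - - [10/Oct/2014:13:55:36 -0700] "GET / HTTP/1.1" 200 2326 "-" "-"',
        '198.51.100.7 - - [10/Oct/2014:13:56:01 -0700] "GET / HTTP/1.1" 200 2326 '
        '"http://example.com/" "Mozilla/5.0 (Windows NT 6.1)"',
    ]
    print(apache_deny_rules(suspicious_ips(tally_hits(sample))))
```

To run it against a real log, you could pass an open file instead of the sample list, for example `tally_hits(open("access.log"))`, where `access.log` stands in for wherever your own log file lives. Treat the flagged list as a starting point for review, not as a final verdict, since a busy office network or a legitimate search engine crawler can also produce a lot of hits from one IP.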
The Threat of Bots and How to Combat Them
Even though bots can be a nuisance, being aware that they exist and are an active threat allows you to deal with them. The methods above do work, but they are not 100% effective in determining whether an IP is a bot. Continuous traffic monitoring through a third-party solution can give you a more reliable way of filtering your traffic and keeping bots away from your pages.
Similarly, adding reCAPTCHA can stop bots and prove to be a viable measure, although having to prove your humanness can be quite annoying if you’re a human. At the end of the day the security of your site is of the utmost importance. It’s better to prevent bots from accessing your site than to deal with the fallout of their malicious actions. Prevention will always be better than cure, and when it comes to websites, bots can deliver quite a blow to the viability of your content marketing strategy with very little effort.