Writing by Nick Stamoulis on Saturday, 26 of April , 2008 at 10:51 am
Do you know what version of WordPress or Joomla you are using? It doesn’t matter what CMS, blogging software, or other website software you use. Every now and then it will be updated and you’ll need to get the latest version. I highly recommend that you always use the latest version for one simple reason: security.
Hackers and crackers are smart guys. They learn too. Every time a software package is updated, its makers fix the security issues found in the last version. But security only goes so far. The bad guys are out there learning the latest tricks too.
This is especially true if you are using open source software such as:
- WordPress
- Joomla
- Drupal
- Pligg
- Or one of many other software packages available online
Open source means that anyone can view the source code and modify it. That’s essentially an open door for hackers and other bad guys to fiddle with the code and learn how it works. Then, when they figure it out, they create their own brands of software, robots, and malicious tools to break the code and hack into websites that use that software. That’s why you always see new software updates that say they’ve fixed security issues: there are always security issues.
That doesn’t mean that your website is perpetually insecure. It just means that you always have to stay one step ahead of the bad guys. And the surest way to do that is to update your software.
Category: Robots, Security, Webmaster Tools
Writing by Nick Stamoulis on Sunday, 13 of April , 2008 at 12:46 pm
As this post from Matt Cutts illustrates, Google is now starting to crawl drop-down boxes:
Now Google is finding ways to crawl through forms and drop-down boxes. We only do this for a small number of high-quality sites right now, and we’re very cautious and careful to do the crawling politely and abide by robots.txt. If you’d prefer that Google not crawl urls like this, you can use robots.txt to block the urls that would be discovered by crawling through a form. But I hope that the dialog above is a pretty good example of why this new discovery method can be helpful to webmasters.
I find two critical pieces in this snippet that should be mentioned. First, right now they are only crawling forms and drop-down boxes for “high quality” sites. I’m not sure what “high quality” means here, but my guess is that it doesn’t mean every site online. I can also conjecture that it probably means something akin to Google’s already outlined quality guidelines for webmasters: if you run PPC campaigns, then your landing pages must meet a certain level of quality. If you don’t run PPC campaigns, then quality probably means the same thing it means for determining PageRank and rankings within the SERPs. That’s just a guess.
The second critical thing about this statement is that you can prevent Google from crawling forms and drop-down boxes with robots.txt. This is probably more critical for forms than for drop-down boxes. Most drop-down boxes are only going to include pages that are not protected by passwords. But if you have menu items in your drop-down boxes whose pages are password protected and you don’t want the search engines to access them, then you need to include them in your robots.txt file. With forms, it is possible that form information saved as HTML documents on a server could be crawled. That could compromise the privacy of your form users or expose critical information that you are trying to protect. That’s a good time to use robots.txt. It’s worth researching this in more depth before you rely on it.
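If you want to check that a robots.txt rule really covers the URLs a form would generate, here is a minimal Python sketch using the standard library’s urllib.robotparser. The paths /search and /members/ and the example.com URLs are made up for illustration, and the parser only approximates how a real search engine reads the file, so treat it as a sanity check rather than a guarantee.

```python
from urllib.robotparser import RobotFileParser


def build_robots_txt(form_paths, agent="*"):
    """Build robots.txt text that disallows each given path prefix."""
    lines = [f"User-agent: {agent}"]
    lines += [f"Disallow: {path}" for path in form_paths]
    return "\n".join(lines) + "\n"


def crawl_allowed(robots_txt, agent, url):
    """Return True if the rules in robots_txt let this agent fetch the URL."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url)


# Hypothetical paths: /search is where a form submits, /members/ is private.
robots_txt = build_robots_txt(["/search", "/members/"])

# A URL produced by submitting the form, and an ordinary page.
form_url = "http://example.com/search?category=shoes"
normal_url = "http://example.com/about.html"
print(crawl_allowed(robots_txt, "Googlebot", form_url))
print(crawl_allowed(robots_txt, "Googlebot", normal_url))
```

Because the rule is a path prefix, it should cover every query string the form can produce, not just the one shown here.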
Category: Robots, Search Engines
Writing by Nick Stamoulis on Monday, 25 of February , 2008 at 2:16 pm
According to Matt Cutts, Google is considering other ways of handling the NoIndex meta tag, as it relates to Search Engine Optimization. Of course, if he had his way he’d make sure none of the search engines indexed a page with the NoIndex meta tag. Currently, all the search engines have their own policies regarding NoIndex. Here’s the breakdown:
- Google doesn’t show the page in any way
- Ask doesn’t show the page in any way
- MSN shows a url reference and Cached link, but no snippet. Clicking the cached link doesn’t return anything.
- Yahoo! shows a url reference and Cached link, but no snippet. Clicking on the cached link returns the cached page.
Personally, I’d rather a page with the NoIndex meta tag not be shown at all than be referenced and cached. I don’t want to click a link in a SERP and get a 404 error page or a page not returned for some reason.
Google currently doesn’t show a NoIndex page. But Cutts wants feedback to find out what web users expect. Google is considering changing its policy, and Cutts points out some of the benefits of each possible approach. But there is more than one way to find “some middle ground in between.”
One way that Google, and the other search engines, could handle NoIndex meta tags for pages that are popular and well known is to analyze whether they have any valuable back links as part of a website’s Search Engine Optimization. If not, then don’t show them. If so, then go ahead and index them anyway. If a webmaster lists the site in the ODP and has articles out there pointing at it and then decides to add a NoIndex meta tag, they aren’t really serious about NoIndexing. For them, it is likely a temporary situation. The search engines could add another tag, a temporary or timed NoIndex, that allows webmasters to NoIndex a certain web page, or an entire web site, for a set period. Once that period has expired, the pages appear in the SERPs again. That way, webmasters who have issues that need to be fixed can fix them during that time and return to the SERPs. The regular NoIndex tag can be reserved for parked domains and such.
Category: Robots, SEO
Writing by Nick Stamoulis on Tuesday, 19 of February , 2008 at 1:47 pm
I’ve got to hand it to Andy Beard. Here’s a creative way of getting around Google’s paid link policy. The search engine has effectively told webmasters that they will be penalized for selling links. Many webmasters have resorted to adding nofollow attributes to their paid links so that they aren’t penalized. That penalizes the webmasters buying the links, as they don’t get any PR transference from those nofollow links. So will Beard’s Robots.txt solution solve his problem?
It’s possible, but not likely. Robots.txt lets webmasters tell Google not to crawl their pages. If you have a page on which you link to other sites that have paid you to do so, then you can prevent Google from crawling that web page altogether. The page will still be indexed, but it won’t be crawled. And the natural follow-up question is, “Well, if it isn’t crawled, then how will it be indexed?”
If there is even one inbound link to that page from anywhere else on the web, then it will be indexed. Google may not crawl it, and therefore will not crawl the links on that page, so it will potentially never discover that those links are paid links. But what Andy Beard is banking on is his content being syndicated by either aggregators or scrapers, the first being a legitimate and ethical form of syndication and the second an unethical one. Either way, he wins, as do his link buyers.
Any content that is syndicated will provide inbound links from other sources. Google is unlikely to count syndicated paid links as paid links. The search engine won’t know that they are paid links on the syndicated pages. How would it know? It could figure it out on the original content page, but since it isn’t crawling that page, it becomes a non-issue. This is what Beard is banking on.
He says he hasn’t implemented the Robots.txt strategy yet. My bet is, if he does, then other webmasters will follow and will get away with selling links for some time. They will even figure out how to syndicate each other’s content effortlessly for a period of time. Eventually, Google will institute another policy. But short term, it looks like it might work.
Category: Link Selling, Robots
Writing by Nick Stamoulis on Saturday, 26 of January , 2008 at 9:30 am
Does your SEO know how to sell? Can he put together a marketing plan for your website? These are skills that more and more SEOs will have to learn in order to be more competitive. Soon, SEOs who know how to optimize a web page but don’t know how to use that web page to sell a product will fall out of fashion. In essence, your website is a sales brochure. Either it converts traffic into money for you, or it’s useless.
SEO is fine and dandy. Yes, you should write your content so that you rank well for your key terms, but does it make much of a difference if you get tons of traffic and no sales? It should.
Unless your website is an AdSense site, you’ll want people to buy your product or service. To get them to do that online, you’ll need to sell them on it, and that means you’re going to have to learn how to write sales copy. That’s what a website is. It’s sales copy that is SEO’d: sales copy written in such a way as to attract both human buyers and search robots.
The fundamental truth of all web writing, and too many SEOs forget this, is that you write for two audiences: human visitors and spiders. You’ve got to feed both targets what they want. If you don’t, you’re just writing hack content that isn’t worth the bits and bytes it’s printed on.
Category: Robots, SEO
Writing by Nick Stamoulis on Friday, 18 of January , 2008 at 2:16 pm
So you’ve decided that you want a part of your site to be secure. Good for you. But how do you ensure that it stays secure?
Just in case you aren’t aware of it: any time you see http:// in a URL, it means you are looking at a non-secure web page. By contrast, https:// denotes that the web page is secure. Typically, you’ll see the https:// in the URL on pay pages, whenever you are purchasing something and you need to give out credit card information or other personal data. If you don’t see the https:// on a page that asks for your personal or private information, then run away. Don’t give up your information because it isn’t secure.
But there are other reasons for wanting to ensure certain web pages are secure. Maybe you have a membership site and you want to restrict the membership benefits of your site to members only. You may not necessarily need the https:// distinction on those pages, but you definitely should have a password system to give access to those site visitors who are members. But there are some other things you should do as well.
First, the danger of web security. You might make the mistake, if you are new to this area of webmastering, of leaving an unsecured link leading from the non-secure part of your site to the secure part of your site. If that happens, then there is a hole in your security and anyone will be able to get in. To reduce that risk, you can add “nofollow” and “noindex” values to a robots meta tag on those pages, or disallow the pages in your robots.txt file. By keeping the search engine spiders from crawling your links and indexing your secure pages, you make it less likely that security holes are leaked and that future searchers find your secure web pages in the SERPs. It doesn’t replace your password system, but it’s a small measure you can take to help the secure parts of your website remain secure.
Category: Meta Tags, Robots, SEO, Webmaster Tools
Writing by Nick Stamoulis on Wednesday, 16 of January , 2008 at 2:22 pm
You’ve heard of robots meta tags. But how do you use them? If you’ve been online for very long you likely know about robots “nofollow” tags and maybe even “noindex” tags. But what about “noarchive”? Are you familiar with that one? Should you be?
SiteProNews today published an article by Scott Buresh explaining all the different robots meta tags and when you should use them. The last one on the list is the “noarchive” tag. He hit the nail right on the head:
Finally, almost all search engines today, including Google and Yahoo, offer a cached version of a site alongside its listing that provides a snapshot of what the page used to look like. The noarchive tag, therefore, is available to be used in circumstances where there is content on your website that is of a timely nature and therefore that you might not necessarily want search engine spiders to cache for people to have access to moving forward.
To further back up his claim, he uses the example of a business that is running a timed special. When the sale or special offer is over and the time to take advantage of it has expired, you’ll likely not want future searchers pulling up a cached copy of that web page and finding the special. They’ll want to take advantage of it too. But you can keep that cached copy away from those searchers just by using the robots “noarchive” meta tag. It’s a good example of when to use a robots tag to keep certain information hidden.
Another example might be when your company is between CEOs, or other high ranking officials. If your VP is acting CEO until the new guy comes in then you might want to reflect that on a special page that will be visible to search engines and searchers until the new guy starts. That would be another good time to use the “noarchive” tag. You’ll likely think of a few on your own now that you know what it can do.
Category: Robots, SEO
Writing by Nick Stamoulis on Tuesday, 15 of January , 2008 at 8:34 am
Content scraping is getting to be a big business, and the scrapers are getting ever more clever.
In the old days, content scrapers would copy/paste your content and slap AdSense ads on the page where your content was used and make a dollar or two before they were reported. Then those nasty bots came around and they were able to scrape your content without actually visiting your website. The bots did all the work in about 10 minutes. The “operator” simply input a list of keywords that the bot was to home in on, and when those keywords were found the content was scraped and held in the bot’s “memory.” Then the bot would be told where to put the content that was scraped, right alongside those pesky AdSense ads. These were called Made For AdSense sites, or MFAs. They worked, and the scrapers could usually get by with making $10 or $15 per site before being taken down. You do that a couple of hundred times in one month and you’ve got a pretty decent income. Do it a thousand times each month and you’ll be a rich man before you know it.
Well, Google seems to have cracked down on the MFA sites because I don’t see as many of them as I used to. But there is another kind of content scraper that is starting to become more popular now. These content scrapers try to appear as if they’re not really scraping your content. They actually give you a link back, unlike those old MFA guys who just stole your content with a lack of conscience.
The new scrapers take your content and put a link at the top of it out of the goodness of their hearts. At the bottom of your content (on their web pages, mind you), they link to another site that they own in hopes of selling services and products related to the keywords they are targeting with your hard work. Pretty clever, eh?
You can see a sample of this kind of content scraping on several of my past blog posts, but I’ll just link to this one and you can take a look at that first comment.
I love how Saskatoon Web Design just took my entire blog post. Sure, they gave attribution. I got the link back. They even included the internal links that were a part of my original post. All except the affiliate links (they must have scraped my content before I went back in and added those). But at the bottom of the blog post with the words that I wrote, you’ll see a link to the index page of the website on which the Rawk Media blog sits. Clever. These guys are trying to sell their services using content that I wrote, based on the keyword research that I’ve done. I suppose that works if you can’t write. But it does violate existing copyright laws. If they’re not careful, someone may report them to their ISP.
Category: Content Development, Robots
Writing by Nick Stamoulis on Thursday, 10 of January , 2008 at 7:51 am
Matt Cutts has a good video today on Google Webmaster Central explaining how to prevent certain pages on your website from being crawled by the search engines.
You really need to be familiar with five methods of preventing the spiders from crawling your pages:
- htaccess
- noindex
- nofollow
- robots.txt
- password protect
Your htaccess file (.htaccess on Apache servers) is a ticket to solving a lot of your search engine problems. Not all of them, but some of them. It’s a file on your server that gives instructions to the web server, telling it how to handle requests from browsers and search engine spiders. One common use of this file is to redirect old web pages to new web pages. Frequently, webmasters will update their information and, when doing so, will change the URL of a web page. Well, if you do that, then you still have that old web page indexed, and when people try to visit that page they will get a 404 error page. To prevent that from happening, you can add a 301 redirect command in your htaccess to redirect traffic to your new page.
But the htaccess has other uses as well and you can actually use it to tell the search engines certain information that will prevent them from crawling your web pages. More on this later.
Perhaps the most common way to instruct search engines not to crawl certain pages of your website is the robots.txt file. You can use this file to tell all the search engines, or just some of them, not to crawl specific pages. You just give the URLs of the pages you don’t want to be crawled and specify which search engines are not allowed to crawl those pages.
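To make the “all the search engines, or just some of them” part concrete, here is a small Python sketch that reads a sample robots.txt with the standard library’s urllib.robotparser. The file contents, the /drafts/ and /private/ paths, and the example.com URLs are all hypothetical, and real crawlers may interpret edge cases differently than this parser does.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: one group just for Googlebot, one for everyone else.
ROBOTS_TXT = """\
User-agent: Googlebot
Disallow: /drafts/

User-agent: *
Disallow: /private/
"""


def is_allowed(agent, url, robots_txt=ROBOTS_TXT):
    """Return True if the given crawler may fetch the URL under these rules."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url)


# Googlebot has its own group, so only the /drafts/ rule applies to it.
print(is_allowed("Googlebot", "http://example.com/drafts/post.html"))
# Any other crawler (Slurp here) falls back to the "*" group.
print(is_allowed("Slurp", "http://example.com/private/notes.html"))
```

Note that a crawler with its own group ignores the “*” group entirely, so in this sample Googlebot would still be allowed into /private/. If you want a rule to apply to everyone, repeat it in each group.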
The noindex meta tag is a bit different from the robots.txt file. It tells the search engines not to show a page in their index. They’ll still crawl it, but they won’t show it in their index, so anyone searching for a key term will not see that page on that search engine. Again, you can name specific search engines or make it general for all search engines.
The nofollow meta tag tells the search engines not to follow the links on a page. So if the only link to a certain page on your website sits on a nofollow page, the search engines will not find that page through that link. The meta tag covers all the links on a page; to nofollow just some of them, add the rel=“nofollow” attribute to the individual links instead.
Finally, if you password protect certain pages, the search engines will not crawl them. They cannot guess your password so those pages are safe. Users of your website can get to them, but the search engines cannot. You can password protect your pages using the htaccess file that I discussed earlier.
Keep in mind that there are complications with each of these methods. The safest and most powerful of all of these methods is the htaccess. The least effective is the nofollow tag because, while the links aren’t followed, the page is still on a server somewhere. If you visit that page in your browser, then move on to another page on your website, and you have an analytics program that shows links for referrers, that link could get crawled and you’ll still get traffic to the page. Not a lot, but some, and you’ll run the risk of someone else linking to it. You have the same problem with noindex tags and robots.txt files, so be careful.
For more information on preventing your pages from being crawled, watch Matt Cutts’ video on that topic. He also discusses how to de-index certain URLs you have mistakenly indexed.
Category: Meta Tags, Robots, SEO Tools, Search Engines
Writing by Nick Stamoulis on Sunday, 16 of December , 2007 at 8:47 am
In SEO, a surprising number of people get a little too caught up in what they are doing. It is astonishing just how many SEO writers actually forget that they are writing for people. People do read SEO content, and if you get too focused on the robots, then you are likely to lose out on leads. There is nothing worse than reading an article that is written entirely for robots. It is obvious and it reads terribly. It is a very quick way to turn people away from your site.
While your site does have to be geared towards robots, it is possible to do so without turning away human customers. A smattering of keywords is all that is required. Seriously, once or twice per paragraph is enough. I have seen some sites that come closer to once or twice per sentence.
Do not spend too much time worrying about keyword density. It is not the be-all and end-all that some would claim. As long as your keywords appear throughout your content, robots will index it. Getting too scientific about keyword density makes content production take longer than is necessary. Keep it simple: replace unnecessary pronouns with keywords if you feel that you need to boost the density a little bit, and play it by ear.
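If you do want a rough number to go with “once or twice per paragraph,” keyword density is just the keyword count divided by the total word count. Here is a quick Python sketch of that arithmetic. It assumes a single-word keyword and a very simple definition of a “word,” both of which are my simplifications, and the sample sentence is made up.

```python
import re


def keyword_density(text, keyword):
    """Return the share of words in text that equal keyword (0.0 to 1.0)."""
    # Simplification: a word is any run of letters, digits, or apostrophes.
    words = re.findall(r"[a-z0-9']+", text.lower())
    if not words:
        return 0.0
    hits = sum(1 for word in words if word == keyword.lower())
    return hits / len(words)


sample = "SEO content helps people. Good SEO reads well."
# 2 of the 8 words are "seo", so this prints 25%
print(f"{keyword_density(sample, 'seo'):.0%}")
```

A figure like that is only a sanity check. If the copy reads badly to a person, no density number will save it.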
If you are concerned about the robot / human balance of your material, you can always contact us at Brick Marketing. We have a great deal of experience at producing high quality SEO content.
Category: Robots
Writing by Nick Stamoulis on Saturday, 10 of November , 2007 at 4:13 pm
There might be a time when you do not want the search engines to crawl certain pages. You can prevent them from crawling certain pages by simply inserting the right code in its proper place. Contrary to some people’s thinking, you wouldn’t use the “no follow” tag.
The no follow tag is used to tell the search engines not to transfer your PageRank to the pages your links connect with. That’s a different thing altogether. The search engines will still crawl those links and index the pages you link to. But if you don’t want them indexing the pages on the other side of those links what would you do?
The proper code is a robots meta tag. It’s part of what is called the Robots Exclusion Standard. The robots meta tag uses the following syntax:
<meta name="robots" content="noindex,nofollow" />
If you add this code within the HEAD section of your HTML, the search engines will not index that page and will not follow its links. To prevent following links only, take out the noindex value. To prevent indexing of the page only, take out the nofollow value.
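If you want to confirm which robots values a page actually carries, a short Python sketch like the one below can read them out of the HTML using the standard library’s html.parser. The class and function names and the sample page are my own for illustration, and this only looks at the generic “robots” meta tag, not tags aimed at a specific search engine.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collect the values listed in <meta name="robots" content="..."> tags."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        # Self-closing tags such as <meta ... /> are also routed here.
        if tag != "meta":
            return
        attr = dict(attrs)
        if (attr.get("name") or "").lower() != "robots":
            return
        content = attr.get("content") or ""
        for part in content.split(","):
            if part.strip():
                self.directives.add(part.strip().lower())


def robots_directives(html):
    """Return the set of robots meta values found in an HTML string."""
    parser = RobotsMetaParser()
    parser.feed(html)
    return parser.directives


page = (
    "<html><head>"
    '<meta name="robots" content="noindex,nofollow" />'
    "</head><body>Members only</body></html>"
)
print(sorted(robots_directives(page)))
```

An empty result means the page has no robots meta tag, in which case the search engines will index the page and follow its links by default.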
Category: Robots
The Search Engine Optimization Journal Blog is Owned By Brick Marketing. For more info call 877-295-0620.