Screaming Frog (https://www.screamingfrog.co.uk) is an excellent tool for crawling websites and extracting data, but if it isn’t crawling all of your URLs, you can’t run a quality technical SEO audit (meta descriptions, response codes, internal linking, duplicate content, page titles, backlinks, alt text, etc.) on your e-commerce sites. In this blog post, we’ll examine why Screaming Frog isn’t crawling all URLs and how you can fix the issue.
How to Fix Screaming Frog Not Crawling All URLs
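There are several reasons Screaming Frog may not crawl all subdomains or URLs on a website. The most common is that the website is configured to block crawlers like Screaming Frog.
The site is blocked by robots.txt.
Robots.txt can block Screaming Frog from crawling pages. You can configure the SEO Spider to ignore robots.txt in its robots.txt settings, found under Configuration >> Robots.txt in most versions (the exact menu location can vary by version).
You can also change your User-Agent to Googlebot to see if the website allows that crawl.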
Robots.txt is used to instruct web crawlers, or “bots,” on what they are allowed to access on a given website. A well-behaved bot reads the file before crawling and skips any page that is disallowed for its User-Agent. In some cases, the block is intentional. For example, a site owner may want to keep bots away from sensitive or low-value sections. In other cases, it may simply be due to an oversight. Regardless of the reason, pages disallowed in robots.txt will not be crawled by any crawler that respects the file, which includes Screaming Frog in its default configuration.
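To see how a robots.txt rule plays out for different crawlers, here is a minimal Python sketch using the standard library's urllib.robotparser. The robots.txt content, the example.com URLs, and the exact User-Agent token for Screaming Frog are assumptions made for illustration; check the real file on the site you are auditing.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content, used only for illustration.
# The "Screaming Frog SEO Spider" token is an assumption about how
# a site might target the crawler.
ROBOTS_TXT = """\
User-agent: Screaming Frog SEO Spider
Disallow: /

User-agent: *
Disallow: /checkout/
"""


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the given robots.txt text lets user_agent fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# Compare how the same URL is treated for two different crawlers.
for agent in ("Screaming Frog SEO Spider", "Googlebot"):
    allowed = is_allowed(ROBOTS_TXT, agent, "https://example.com/products/shoes")
    print(f"{agent}: {'allowed' if allowed else 'blocked'}")
```

If the result differs between the two User-Agents, the site is singling out specific crawlers rather than blocking the page for everyone.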
A 'nofollow' attribute is present on links that are not being crawled.
Nofollow links tell crawlers not to follow them. If all links on a page are set to nofollow, then Screaming Frog has nowhere to go. To bypass this, you can set Screaming Frog to follow internal nofollow links.
You can update this option in Configuration >> Spider, under the Crawl tab, by ticking Follow Internal 'nofollow' links.
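To check whether nofollow is what leaves the crawler with nowhere to go, you can split a page's links into followed and nofollowed ones. The following is a rough Python sketch using only the standard library; the sample HTML and the names LinkCollector and split_links are made up for illustration.

```python
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Collect <a href> links, separated by whether rel contains 'nofollow'."""

    def __init__(self):
        super().__init__()
        self.followed = []
        self.nofollowed = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        attr_map = dict(attrs)
        href = attr_map.get("href")
        if not href:
            return
        # rel is a space-separated token list, e.g. rel="nofollow noopener".
        rel_tokens = (attr_map.get("rel") or "").lower().split()
        if "nofollow" in rel_tokens:
            self.nofollowed.append(href)
        else:
            self.followed.append(href)


def split_links(html: str):
    """Return (followed_links, nofollowed_links) found in the HTML."""
    collector = LinkCollector()
    collector.feed(html)
    return collector.followed, collector.nofollowed


# Hypothetical page fragment for illustration.
SAMPLE_HTML = """
<a href="/category/shoes" rel="nofollow">Shoes</a>
<a href="/category/hats" rel="noopener NOFOLLOW">Hats</a>
<a href="/about">About</a>
"""

followed, nofollowed = split_links(SAMPLE_HTML)
print("followed:", followed)
print("nofollowed:", nofollowed)
```

An empty followed list on a key navigation page would explain why a default crawl stops there.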
The page has a page-level 'nofollow' attribute.
The page-level nofollow attribute is set by either a meta robots tag or an X-Robots-Tag in the HTTP header. These can be seen in the “Directives” tab under the “Nofollow” filter. The page-level nofollow attribute is used to prevent search engines from following the links on a page.
Site owners sometimes use it on pages that link to unreliable or unimportant sources. Setting it tells search engines not to follow any of the links on the page. That may suit the site owner, but it also stops Screaming Frog from crawling beyond that page.
To ignore the Noindex tag, go to Configuration >> Spider >> Advanced and uncheck the Respect Noindex setting. For page-level nofollow, the relevant setting is the Follow Internal 'nofollow' option under Configuration >> Spider.
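Since the directive can come from either the HTML or the HTTP response, it helps to look in both places. The following Python sketch gathers the robots directives from a meta robots tag and an X-Robots-Tag header. The function name page_level_directives and the sample inputs are hypothetical, and the header parsing is simplified: it ignores the optional user-agent prefix that X-Robots-Tag values may carry.

```python
from html.parser import HTMLParser


def _split_directives(value: str) -> set:
    """Split a comma-separated directive string into lowercase tokens."""
    return {token.strip().lower() for token in value.split(",") if token.strip()}


class MetaRobotsParser(HTMLParser):
    """Collect directives from <meta name="robots" content="..."> tags."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr_map = dict(attrs)
        if (attr_map.get("name") or "").lower() == "robots":
            self.directives |= _split_directives(attr_map.get("content") or "")


def page_level_directives(html: str, headers: dict) -> set:
    """Combine directives from the meta robots tag and the X-Robots-Tag header."""
    parser = MetaRobotsParser()
    parser.feed(html)
    directives = set(parser.directives)
    for name, value in headers.items():
        # Header names are case-insensitive.
        if name.lower() == "x-robots-tag":
            directives |= _split_directives(value)
    return directives


# Hypothetical inputs for illustration.
html = '<head><meta name="robots" content="noindex, nofollow"></head>'
headers = {"X-Robots-Tag": "noarchive"}
found = page_level_directives(html, headers)
print("page-level nofollow:", "nofollow" in found)
```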
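The User-Agent is blocked.
The User-Agent is a text string that your browser (or crawler) sends to the website you are visiting. It can provide information about your browser, your operating system, and even your device. Based on this information, a website can change how it behaves. For example, if you visit a site on a mobile device, the site may redirect you to its mobile-friendly version. Or, if you change the User-Agent to pretend to be a different browser, you may be able to access features that aren't available in your actual browser. Similarly, some websites block certain browsers or crawlers entirely. By changing the User-Agent, you can change how a website responds to your requests.
You can change the User-Agent under Configuration >> User-Agent.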
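One way to confirm a User-Agent block outside of Screaming Frog is to request the same URL with two different User-Agent strings and compare the status codes. The Python sketch below does this with the standard library; the Screaming Frog string (including its version number) is an assumption, and build_request and fetch_status are illustrative helper names.

```python
import urllib.error
import urllib.request

# User-Agent strings to compare. The Screaming Frog value is an assumption;
# copy the exact string from Configuration >> User-Agent in your install.
USER_AGENTS = {
    "screaming_frog": "Screaming Frog SEO Spider/19.0",
    "googlebot": "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)",
}


def build_request(url: str, user_agent: str) -> urllib.request.Request:
    """Build a GET request that identifies itself with the given User-Agent."""
    return urllib.request.Request(url, headers={"User-Agent": user_agent})


def fetch_status(url: str, user_agent: str, timeout: float = 10.0) -> int:
    """Return the HTTP status code the server sends for this User-Agent."""
    try:
        with urllib.request.urlopen(build_request(url, user_agent), timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as error:
        # 4xx/5xx responses raise HTTPError; the code is what we want to compare.
        return error.code
```

Calling fetch_status("https://example.com/", USER_AGENTS["screaming_frog"]) and then again with USER_AGENTS["googlebot"] (substituting your own URL) shows whether the server treats the two differently, for example a 403 for one and a 200 for the other. Some sites verify Googlebot by IP address, so a spoofed User-Agent may still be refused.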
The site requires JavaScript.
JavaScript is a programming language that is commonly used to create interactive web pages. When JavaScript is enabled, it can run automatically when a page is loaded, making it possible for items on the page to change without the need to refresh the entire page. For example, JavaScript can be used to create drop-down menus, display images based on user input, and much more. The catch for crawling is that some sites build their navigation and links with JavaScript, so a crawler that only reads the raw HTML never sees those links.
Try enabling JavaScript rendering within Screaming Frog under Configuration >> Spider >> Rendering.
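Before switching rendering on, a quick look at the raw HTML can hint at whether JavaScript is the culprit. The sketch below is a crude heuristic of my own, not how Screaming Frog decides anything: it flags pages whose unrendered HTML loads scripts but contains almost no plain links. The threshold and the function name likely_needs_rendering are assumptions.

```python
from html.parser import HTMLParser


class TagCounter(HTMLParser):
    """Count plain <a href> links and <script> tags in unrendered HTML."""

    def __init__(self):
        super().__init__()
        self.links = 0
        self.scripts = 0

    def handle_starttag(self, tag, attrs):
        if tag == "a" and dict(attrs).get("href"):
            self.links += 1
        elif tag == "script":
            self.scripts += 1


def likely_needs_rendering(raw_html: str, min_links: int = 1) -> bool:
    """Heuristic: scripts are present but fewer than min_links plain links exist."""
    counter = TagCounter()
    counter.feed(raw_html)
    return counter.scripts > 0 and counter.links < min_links


# Hypothetical single-page-app shell: no links until JavaScript runs.
APP_SHELL = '<html><body><div id="root"></div><script src="/app.js"></script></body></html>'
print(likely_needs_rendering(APP_SHELL))
```

A positive result only suggests that rendering is worth trying; a page can need JavaScript for some links while still exposing others in plain HTML.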
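The site requires cookies.
Can you view the site with cookies disabled in your browser? If not, the site requires cookies. Licensed users can enable cookies by going to Configuration >> Spider and selecting Session Only under Cookie Storage in the Advanced tab.
The site uses framesets.
The SEO Spider does not crawl the frame src attribute.
The Content-Type header does not indicate that the page is HTML.
This is shown in the Content column and should be either text/html or application/xhtml+xml.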
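The Content-Type check is easy to reproduce on a header value copied from your browser's developer tools. Here is a minimal Python sketch, with the helper name is_html_content_type being my own; it strips parameters such as charset before comparing against the two accepted values.

```python
# The two media types named above as acceptable for HTML pages.
HTML_CONTENT_TYPES = {"text/html", "application/xhtml+xml"}


def is_html_content_type(header_value: str) -> bool:
    """Return True if a Content-Type header value identifies an HTML page."""
    # Drop parameters such as "; charset=UTF-8" and normalise case.
    media_type = header_value.split(";", 1)[0].strip().lower()
    return media_type in HTML_CONTENT_TYPES


for value in ("text/html; charset=UTF-8", "application/json"):
    print(value, "->", is_html_content_type(value))
```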
Conclusion
The Screaming Frog SEO Spider can be an excellent tool for auditing your website, but it’s vital to ensure that all URLs are crawled. If you’re not getting the complete data that you need from your audits, there may be an issue with how Screaming Frog is configured or with how the site responds to crawlers. This blog post looked at why Screaming Frog might not be crawling all your URLs and how to fix the problem. By fixing these issues, you’ll be able to get more comprehensive data from your Screaming Frog audits and improve your SEO strategy. Have you tried using Screaming Frog for your website audits? What tips do you have for improving its functionality?
FAQ
Why isn't Screaming Frog crawling all URLs?
Published on: 2022-06-07
Updated on: 2024-04-05