You have probably noticed that I’m comparing four frameworks that have nothing to do with each other. To begin with, they are written in different languages, with different mojos and flavors. I’m not interested in the endless and pointless wars between compiled and interpreted languages, strong and weak typing, or client side and server side. What I want to find out is how good their websites are; after all, that’s how they present their product to the community.
Whenever I have problems or doubts with a framework, the first thing I do, like most of us, is google it. Lately I have realized that Stack Overflow questions often sit at the top of the results while the official documentation pages do not. A bad website could drastically steepen a framework’s learning curve, reduce the number of developers using it and, in the end, kill the framework.
This post analyzes the websites of those frameworks from an SEO point of view. I’ll try to detect some problems and find out which one has the cleanest website.
Running the analysis
I’m using Botify to crawl and analyze the different websites. The good thing about this service is that it gives you both macro and micro information about the websites.
Each crawl was limited to 100K URLs and includes no information about sitemaps or robots.txt.
Let’s first compare the main metrics of the analyses:
| Metric | Spring | AngularJS | Symfony | Django |
|---|---|---|---|---|
| Compliant URLs | 88,827 (88.79%) | 22 (40.74%) | 31,473 (70.25%) | 68,575 (68.55%) |
| Not Compliant URLs | 11,211 (11.21%) | 32 (59.26%) | 13,328 (29.75%) | 31,464 (31.45%) |
| 2xx URLs | 98,026 (97.99%) | 23 (42.59%) | 41,363 (92.33%) | 84,152 (84.12%) |
| Compliant URLs with Bad H1 | 79,952 (79.92%) | 21 (38.89%) | 28,561 (63.75%) | 47,460 (47.44%) |
| Compliant URLs with Bad Description | 88,508 (88.47%) | 21 (38.89%) | 31,467 (70.24%) | 68,575 (68.55%) |
| URLs with 1 Follow Inlink | 32,733 (32.72%) | 2 (3.70%) | 4,877 (10.89%) | 18,617 (18.61%) |
The first thing that stands out is the number of crawled URLs for AngularJS: only 54. After a quick look around Angular’s website I found that it relies on JavaScript for navigation, so a crawler that does not execute JavaScript will usually miss most of the available pages. A Google search showed that Google has indexed 2,550 URLs for Angular, probably thanks to sitemaps, or because Angular belongs to Google and so they don’t really care about SEO problems. But hey! Bing was able to find 14K URLs on Angular’s website… Say wut? In that case, Angular is out of the competition, simply because I couldn’t get the correct information.
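To illustrate why JavaScript-driven navigation hides pages from a simple crawler, here is a minimal sketch of link discovery on raw HTML. It is not how Botify works internally; the two sample pages and the `renderNavigation` function name are invented for the example. A crawler that only reads the HTML it receives finds links in the first page and none in the second.

```python
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Collects href values from <a> tags present in the static HTML."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def extract_links(html):
    """Return the links a non-JavaScript crawler could follow from this HTML."""
    collector = LinkCollector()
    collector.feed(html)
    return collector.links


# Hypothetical pages, invented for illustration only.
STATIC_PAGE = '<ul><li><a href="/docs">Docs</a></li><li><a href="/blog">Blog</a></li></ul>'
JS_PAGE = '<div id="app"></div><script>renderNavigation()</script>'

print(extract_links(STATIC_PAGE))  # links are in the markup itself
print(extract_links(JS_PAGE))      # the menu only exists after the script runs
```

The second page has a navigation menu only once a browser executes the script, so the crawl stops after a handful of URLs.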
Both Django and Spring reached the URL limit set for this experiment, yet the crawler discovered more than 300K URLs on each of them. Probably there are some forums installed, or maybe there are some “code reference” pages that were automatically generated, displaying all the code and comments for each file, class, method, etc. Meanwhile Symfony has only 44K pages, and almost 30% of them are not compliant (basically URLs that have no SEO value, cf. SEO Compliant URLs), close to Django’s 31%.
None of them seems to really care about H1 and description tags. Django, however, makes an effort to set a correct H1 tag on its pages. We’ll see this in more detail later.
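The totals and percentages quoted above can be rederived from the table. The sketch below copies the counts from the table; the mapping of columns to frameworks is my reading of the surrounding text, and the helper names are made up for the example.

```python
# Counts copied from the table above. The column-to-framework mapping
# is inferred from the text, so treat it as an assumption.
CRAWL = {
    "Spring": {"compliant": 88827, "not_compliant": 11211, "bad_h1": 79952},
    "AngularJS": {"compliant": 22, "not_compliant": 32, "bad_h1": 21},
    "Symfony": {"compliant": 31473, "not_compliant": 13328, "bad_h1": 28561},
    "Django": {"compliant": 68575, "not_compliant": 31464, "bad_h1": 47460},
}


def total_urls(row):
    """Every crawled URL is either compliant or not compliant."""
    return row["compliant"] + row["not_compliant"]


def share(part, whole):
    """Percentage of `part` in `whole`, rounded like the table."""
    return round(100.0 * part / whole, 2)


for name, row in CRAWL.items():
    total = total_urls(row)
    print(
        name,
        total,
        share(row["not_compliant"], total),   # % of all URLs not compliant
        share(row["bad_h1"], row["compliant"]),  # % of compliant URLs with a bad H1
    )
```

The last column is the more telling one for the H1 remark: it measures bad H1 tags against compliant URLs only, rather than against every crawled URL as the table does.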
This is about depth, content-type, protocol and language.
- 11% of Spring pages are not SEO compliant, and the pie chart shows that most of those non-compliant URLs are not HTML pages. After exploring those URLs I see a lot of plain text pages, JSON files and checksums, most of them used as examples on the “Guide” pages.
- Average depth seems OK
- The site is also available in German, but 33% of the site does not have the language meta tag
- The pages are well separated in subdomains where docs.spring.io is 70% of the site and 20% is the repository that maybe should not be indexed
- 80% of the pages are gzipped!
- Average depth seems a little high, but that’s probably because I misconfigured the crawl: the start URL is a 302 redirect
- Almost 80% of the pages belong to code.djangoproject.com (probably automatically-generated pages) and half of those pages are not compliant
- The documentation pages are only 3% of the website
- 10K pages (probably) need to be removed from the site structure because of a bad HTTP response code or a bad content type.
- Performance could be a little better
- Even some 500 HTTP Codes are slow
- The fastest subdomain is the blog
- Django pages are really slow, with an average load time of 2.3 seconds
- The slowest pages are the ones that return a 2xx HTTP code
- code.djangoproject.com is the slowest subdomain
- 15% of Django’s pages are redirects
- 6% of the HTTP redirects are permanent (301). Maybe due to a migration.
- Almost all pages have a title and an H1, but most of them are duplicated
- Descriptions are missing
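The tag problems listed above (duplicated H1 tags, missing descriptions) are easy to check on a small scale. Here is a rough sketch using only the Python standard library; it is not Botify’s implementation, and the three sample pages are invented for the example.

```python
from collections import Counter
from html.parser import HTMLParser


class TagAudit(HTMLParser):
    """Records the <title>, the first <h1> and the meta description of a page."""

    def __init__(self):
        super().__init__()
        self.title = ""
        self.h1 = ""
        self.description = None
        self._current = None   # tag whose text is currently being captured
        self._h1_done = False  # only the first <h1> is kept

    def handle_starttag(self, tag, attrs):
        if tag == "title" or (tag == "h1" and not self._h1_done):
            self._current = tag
        elif tag == "meta":
            attributes = dict(attrs)
            if (attributes.get("name") or "").lower() == "description":
                self.description = attributes.get("content")

    def handle_endtag(self, tag):
        if tag == self._current:
            if tag == "h1":
                self._h1_done = True
            self._current = None

    def handle_data(self, data):
        if self._current == "title":
            self.title += data
        elif self._current == "h1":
            self.h1 += data


def audit(pages):
    """pages maps URL -> HTML. Returns (URLs without description, URLs sharing an H1)."""
    parsed = {}
    for url, html in pages.items():
        parser = TagAudit()
        parser.feed(html)
        parsed[url] = parser
    h1_counts = Counter(p.h1.strip() for p in parsed.values())
    missing_description = sorted(u for u, p in parsed.items() if not p.description)
    duplicate_h1 = sorted(
        u for u, p in parsed.items() if p.h1.strip() and h1_counts[p.h1.strip()] > 1
    )
    return missing_description, duplicate_h1


# Hypothetical pages, invented for illustration only.
PAGES = {
    "/a": '<head><title>Docs</title><meta name="description" content="Guide A"></head><h1>Documentation</h1>',
    "/b": "<head><title>Docs</title></head><h1>Documentation</h1>",
    "/c": "<head><title>Blog</title></head><h1>News</h1>",
}

missing, duplicated = audit(PAGES)
print("No description:", missing)
print("Duplicated H1:", duplicated)
```

On a real site you would feed it the HTML of every crawled page; the same counting approach works for titles.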
What have we learned
If you’re like me, you probably skipped all the previous text and jumped straight to the conclusions. The conclusion is quite simple: from what I have seen, Spring has the “cleanest” website because:
- It has the lowest percentage of non-compliant URLs
- It has the smallest average depth
- Even though it’s not the fastest site, it’s not slow either
- Almost 100% of its URLs return a 2xx HTTP status code
- Its HTML tags need some improvement, but they are slightly better than those of the other frameworks
Symfony, on the other hand, has really fast pages and its HTTP codes are not that bad, while Django makes an effort on the HTML tags but performs poorly.
Some final words
I’m not an SEO “expert”, but I’m learning, so this article might not be entirely accurate. You probably noticed that I’m currently working for Botify (that’s how I was able to run the crawls), and this article made me realize that our tool lacks some features, which we’re currently developing. Anyway, with Botify it is quite easy to detect problems and navigate the website structure to find their real cause. For that, the URL Explorer is quite useful.
If you want to test another framework, go to Botify.com, launch a free crawl, and let me know the results.