Why?

You have probably noticed that I’m comparing four frameworks that have nothing to do with each other. To begin with, they are written in different languages, with different mojos and flavors. I’m not interested in the endless and pointless wars between compiled and interpreted languages, strong and weak typing, or client side and server side. What I want to find out is how good their websites are; after all, that’s how they present their product to the community.

Whenever I have a problem or a doubt about a framework, the first thing I do is google it, like most of us. Lately, though, I have realized that Stack Overflow questions often sit at the top of the results, while the official documentation pages do not. A bad website can drastically change a framework’s learning curve, which in turn affects the number of developers using it and can eventually kill the framework.

This post analyzes the websites of those frameworks from an SEO point of view. I’ll try to detect some problems and find out who has the cleanest website.

Running the analysis

I’m using Botify to crawl and analyze the different websites. The good thing about this service is that it gives you both macro and micro information about the websites.

The crawl was limited to 100k URLs and had no information about sitemaps or robots.txt.

Website Overviews

Let’s first compare the main metrics of the analyses:

Metric                              | Spring          | AngularJS   | Symfony         | Django
------------------------------------|-----------------|-------------|-----------------|----------------
Crawled URLs                        | 100,038         | 54          | 44,801          | 100,039
Compliant URLs                      | 88,827 (88.79%) | 22 (40.74%) | 31,473 (70.25%) | 68,575 (68.55%)
Not Compliant URLs                  | 11,211 (11.21%) | 32 (59.26%) | 13,328 (29.75%) | 31,464 (31.45%)
2xx URLs                            | 98,026 (97.99%) | 23 (42.59%) | 41,363 (92.33%) | 84,152 (84.12%)
Compliant URLs with Bad H1          | 79,952 (79.92%) | 21 (38.89%) | 28,561 (63.75%) | 47,460 (47.44%)
Compliant URLs with Bad Description | 88,508 (88.47%) | 21 (38.89%) | 31,467 (70.24%) | 68,575 (68.55%)
URLs with 1 Follow Inlink           | 32,733 (32.72%) | 2 (3.70%)   | 4,877 (10.89%)  | 18,617 (18.61%)

The first thing that pops out is the number of crawled URLs for AngularJS: only 54. After a quick look around Angular’s website I found out that it relies on JS for navigation, so a crawler usually won’t be able to reach all the available pages. I searched Google to see how many pages it has indexed for Angular and found 2,550 URLs, probably thanks to sitemaps, or because Angular belongs to Google and so they don’t really care about SEO problems. But hey! Bing was able to find 14K URLs on Angular’s website… Say Wut?… In that case, Angular is out of the competition, simply because I couldn’t get the correct information.
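
A minimal sketch of why JS-driven navigation hides pages from a simple crawler. It only reads the HTML as delivered and never runs scripts; the two pages and the `renderNav` call are made up for the example and are not taken from Angular’s website.

```python
from html.parser import HTMLParser


class LinkExtractor(HTMLParser):
    """Collects href values from <a> tags, like a crawler that does not run JavaScript."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.links.append(href)


def extract_links(html: str) -> list:
    parser = LinkExtractor()
    parser.feed(html)
    return parser.links


# Hypothetical pages: one with plain links, one whose menu is built by a script.
static_page = '<ul><li><a href="/docs">Docs</a></li><li><a href="/blog">Blog</a></li></ul>'
js_page = '<div id="nav"></div><script>renderNav(["/docs", "/blog"]);</script>'

print(extract_links(static_page))  # links found in the markup
print(extract_links(js_page))      # nothing to follow: the links only exist after the script runs
```

With no links to follow in the delivered HTML, the crawl stops after a handful of pages.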

Both Django and Spring reached the limit of crawled URLs set for this experiment. However, the crawler found more than 300K URLs on both. Probably there are some forums installed, or maybe some “code reference” pages that were automatically generated, displaying all the code and comments for each file, class, method, etc. Meanwhile Symfony stays at only 44K pages, but almost 30% of them are Not Compliant URLs, a share close to Django’s 31% (basically URLs that have no SEO value, cf. SEO Compliant URLs).

It seems that none of them really cares about H1 and Description tags. However, Django makes an effort to set a correct H1 tag on its pages. We’ll look at that in more detail later.
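
The percentages in the overview table are relative to all crawled URLs. As a quick check, the sketch below uses only the counts from that table to express the Bad H1 figure as a share of compliant URLs instead; the function name is just for illustration.

```python
# Counts copied from the overview table: (compliant URLs, compliant URLs with bad H1).
counts = {
    "Spring": (88_827, 79_952),
    "AngularJS": (22, 21),
    "Symfony": (31_473, 28_561),
    "Django": (68_575, 47_460),
}


def bad_h1_share(compliant: int, bad_h1: int) -> float:
    """Percentage of compliant URLs whose H1 is flagged as bad."""
    return round(100 * bad_h1 / compliant, 1)


shares = {site: bad_h1_share(*pair) for site, pair in counts.items()}

# Print from the best share to the worst.
for site, share in sorted(shares.items(), key=lambda item: item[1]):
    print(f"{site:<10} {share:5.1f}% of compliant URLs have a bad H1")
```

Seen this way, Django sits at roughly 69%, against about 90% for both Spring and Symfony.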

Distribution

This is about depth, content-type, protocol and language.

Spring

  • 11% of Spring pages are not SEO Compliant, and the pie chart shows that most of those non-compliant URLs are not HTML pages. After exploring those URLs I see that there are a lot of plain text pages, JSON files and checksums, most of them used as examples on the “Guide” pages.
  • Average depth seems OK
  • The site is also available in German, but 33% of the site does not have the language meta tag
  • The pages are well separated into subdomains: docs.spring.io is 70% of the site and 20% is the repository, which maybe should not be indexed
  • 80% of the pages are gzipped!
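
A minimal sketch of how a page’s declared language could be detected. It assumes the language is declared either with the `lang` attribute on `<html>` or with a `content-language` meta tag; the sample pages are made up, and a real crawler may look at other signals too.

```python
from html.parser import HTMLParser


class LanguageFinder(HTMLParser):
    """Records the first language declared via <html lang> or a content-language meta tag."""

    def __init__(self):
        super().__init__()
        self.language = None

    def handle_starttag(self, tag, attrs):
        if self.language is not None:
            return  # keep the first declaration only
        attributes = dict(attrs)
        if tag == "html" and attributes.get("lang"):
            self.language = attributes["lang"]
        elif tag == "meta" and (attributes.get("http-equiv") or "").lower() == "content-language":
            self.language = attributes.get("content") or None


def declared_language(html: str):
    finder = LanguageFinder()
    finder.feed(html)
    return finder.language


# Hypothetical pages for illustration.
with_attribute = '<html lang="en"><head><title>Guide</title></head></html>'
with_meta = '<html><head><meta http-equiv="Content-Language" content="de"></head></html>'
undeclared = '<html><head><title>Guide</title></head></html>'

for page in (with_attribute, with_meta, undeclared):
    print(declared_language(page))
```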

Django

  • Average depth seems a little high, but that’s probably because I misconfigured the crawl: the start URL is a 302 redirect
  • Almost 80% of the pages belong to code.djangoproject.com (probably automatically-generated pages) and half of those pages are not compliant
  • The documentation pages are only 3% of the website
  • 10K pages (probably) need to be removed from the site structure, due to a bad HTTP response code or a bad content type.
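
A rough sketch of how such pages could be flagged, using only the two causes named above: HTTP status and content type. The rule is an assumption for illustration and not Botify’s actual compliance definition, which may include further criteria; the URLs are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class CrawledUrl:
    url: str
    status: int
    content_type: str


def looks_compliant(page: CrawledUrl) -> bool:
    """Assumed rule: a 200 response served as HTML. Not Botify's actual definition."""
    is_ok = page.status == 200
    # Drop parameters such as "; charset=utf-8" before comparing.
    is_html = page.content_type.split(";")[0].strip().lower() == "text/html"
    return is_ok and is_html


# Hypothetical crawl records for illustration only.
pages = [
    CrawledUrl("https://example.org/docs/", 200, "text/html; charset=utf-8"),
    CrawledUrl("https://example.org/guide/sample.json", 200, "application/json"),
    CrawledUrl("https://example.org/old-page", 404, "text/html"),
    CrawledUrl("https://example.org/start", 302, "text/html"),
]

to_review = [page.url for page in pages if not looks_compliant(page)]
print(to_review)
```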

Symfony

  • Only 6 pages have a distinct “description” tag
  • Not Compliant URLs sit between depths 2 and 5
  • There are 13k non-compliant URLs, and I don’t see any specific rule in the robots.txt to prevent them from being indexed.
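
A small sketch of how to check whether a robots.txt rule would keep a crawler away from a given URL, using Python’s standard `urllib.robotparser`. The rules and URLs below are invented for the example; they are not Symfony’s actual robots.txt.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content; the paths are made up for this example.
robots_txt = """\
User-agent: *
Disallow: /search/
Disallow: /tmp/
"""

parser = RobotFileParser()
parser.parse(robots_txt.splitlines())


def is_blocked(url: str, user_agent: str = "*") -> bool:
    """True when the parsed rules disallow crawling this URL."""
    return not parser.can_fetch(user_agent, url)


print(is_blocked("https://example.org/search/?q=forms"))
print(is_blocked("https://example.org/doc/current/index.html"))
```

Running the list of non-compliant URLs through a check like this would show which ones no rule covers.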

Performance

Spring

  • Performance could be a little better
  • Even some pages returning an HTTP 500 code are slow
  • The fastest subdomain is the blog

Django

  • Django pages are really slow, with an average load time of 2.3 seconds
  • The slowest pages are the ones returning a 2xx HTTP code
  • code.djangoproject.com is the slowest subdomain
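
A sketch of how an average load time per subdomain could be computed from crawl data. The hostnames and timings are invented for the example and are not the measurements behind the figures above.

```python
from collections import defaultdict
from statistics import mean
from urllib.parse import urlparse

# Hypothetical (url, load time in milliseconds) samples, not real crawl data.
samples = [
    ("https://code.example.org/ticket/1", 3100),
    ("https://code.example.org/ticket/2", 2900),
    ("https://docs.example.org/intro/", 800),
    ("https://docs.example.org/topics/", 1200),
]


def average_by_host(records):
    """Average load time in milliseconds for each hostname."""
    by_host = defaultdict(list)
    for url, millis in records:
        by_host[urlparse(url).netloc].append(millis)
    return {host: mean(times) for host, times in by_host.items()}


averages = average_by_host(samples)
slowest = max(averages, key=averages.get)
print(averages)
print("slowest subdomain:", slowest)
```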

Symfony

  • There’s nothing to say, the charts are clear: the Symfony website is fast

HTTP Codes

Spring

  • One of the cleanest sites I’ve ever seen. Only a few 3xx pages and even fewer 4xx pages

Django

  • 15% of Django’s pages are redirects
  • 6% of the HTTP redirects are permanent (301). Maybe due to a migration.
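
A sketch of how figures like these can be derived from a list of response codes: the share of each status class, and the share of redirects that are permanent. The status list is made up for illustration.

```python
from collections import Counter

# Hypothetical status codes from a crawl, for illustration only.
statuses = [200, 200, 200, 301, 302, 302, 404, 200, 500, 200]


def status_class_share(codes):
    """Percentage of responses in each class (2xx, 3xx, ...)."""
    classes = Counter(f"{code // 100}xx" for code in codes)
    total = len(codes)
    return {cls: 100 * count / total for cls, count in classes.items()}


def permanent_redirect_share(codes):
    """Percentage of redirects (3xx) that are permanent (301)."""
    redirects = [code for code in codes if 300 <= code < 400]
    if not redirects:
        return 0.0
    return 100 * redirects.count(301) / len(redirects)


shares = status_class_share(statuses)
permanent = permanent_redirect_share(statuses)
print(shares)
print(f"{permanent:.1f}% of redirects are permanent")
```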

Symfony

  • It doesn’t have too many problems. Still, it has some 404 pages that need to be fixed

HTML Tags

Spring

  • Half of the pages have a unique title
  • Descriptions are missing

Django

  • Almost all pages have a title and an H1, but most of them are duplicated
  • Descriptions are missing

Symfony

  • Only a few pages have a unique title and H1
  • They made an effort to prevent duplicate content by using canonicals
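
A sketch of how the share of pages with a unique title could be measured. Titles are compared after trimming and lowercasing, which is an assumption made for this example; the pages are invented.

```python
from collections import Counter

# Hypothetical (url, title) pairs; a real crawl would supply these.
pages = [
    ("https://example.org/doc/forms", "Forms"),
    ("https://example.org/doc/routing", "Routing"),
    ("https://example.org/blog/1", "Blog"),
    ("https://example.org/blog/2", "Blog "),
]


def normalize(title: str) -> str:
    """Trim and lowercase so trivial variations count as the same title."""
    return title.strip().lower()


def unique_title_share(records) -> float:
    """Percentage of pages whose normalized title appears exactly once."""
    counts = Counter(normalize(title) for _, title in records)
    unique = sum(1 for _, title in records if counts[normalize(title)] == 1)
    return 100 * unique / len(records)


share = unique_title_share(pages)
print(f"{share:.0f}% of pages have a unique title")
```

The same counting works for H1 or description values.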

What have we learned

If you’re like me, you probably skipped all the previous text and jumped directly to the conclusions. The conclusion is quite simple: from what I have seen, Spring is the “cleanest” website because:

  • It has the lowest percentage of not compliant URLs
  • It has the smallest average depth
  • Even though it’s not the fastest website, it’s not slow either
  • Almost 100% of its URLs return a 2xx HTTP status code
  • It needs some improvement on HTML tags, but it is slightly better than the rest of the frameworks

Symfony, on the other hand, has really fast pages and its HTTP codes are not that bad, while Django makes an effort on the HTML tags but has poor performance.

Some final words

I’m not an SEO “expert”, but I’m learning, so this article might not be too accurate. You probably noticed that I’m currently working for Botify (that’s how I was able to make the crawls), and this article made me realize that our tool lacks some features, which we’re currently developing. Anyway, with Botify it is quite easy to detect problems and navigate the website structure to find the real cause of a problem. For that, the URL Explorer is quite useful.

If you want to test another framework, go to Botify.com and launch a free crawl, and let me know the results.