| Author |
Message |
PhilC
Site Admin

Joined: 21 Nov 2002
Posts: 13052
|
Posted: Thu Feb 17, 2005 1:25 pm
|
|
| Post subject: Google uses several algorithms? |
|
|
One of Google's top people (I think it was Matt Cutts) said not too long ago that Google uses several different algorithms at random. I always found that a bit far-fetched and thought it was a smoke screen - until last night.
My rankings for 'search engine optimization' differ across the datacenters: #22 in one group of datacenters, #27 in another group, and #40/41 in a third. Those rankings vary slightly from time to time, but there are several distinct groups.
So last night I checked the allinanchor:, allintitle:, and allintext: results for a sample from each group and I was surprised by what I found.
The #22 group
allinanchor: #39
allintitle: not in the top 1000
allintext: not in the top 1000
The #27 group
allinanchor: #37
allintitle: >300 (didn't check down to 1000)
allintext: #32
The #40/41 group
allinanchor: #61
allintitle: #57
allintext: #58
The #22 group is ranked higher than the #27 group even though its allinanchor: is ranked lower, and it isn't even in the top 1000 for allintitle: and allintext:. Also, the allinanchor:, allintitle: and allintext: rankings are very different across the groups - so different that it doesn't make sense unless different algorithms are being used.
I'm now inclined to think that Google really does use different algorithms randomly - the random part being which datacenter provides the surfer with the results at any given time, and they change all the time. |
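
A rough sketch of the comparison above, using only the positions quoted in this post. Treating "not in the top 1000" as position 1001, and ">300" as 300, are assumptions made just so the numbers can be compared; the function names are illustrative.

```python
# Rankings reported in the post above; None means "not in the top 1000".
# The ">300" allintitle: figure for the #27 group is stored as 300 (a lower bound).
observed = {
    "#22 group": {"main": 22, "allinanchor": 39, "allintitle": None, "allintext": None},
    "#27 group": {"main": 27, "allinanchor": 37, "allintitle": 300, "allintext": 32},
    "#40/41 group": {"main": 40, "allinanchor": 61, "allintitle": 57, "allintext": 58},
}

OPERATORS = ("allinanchor", "allintitle", "allintext")
UNRANKED = 1001  # assumed stand-in for "not in the top 1000"


def position(value):
    """Treat a missing ranking as worse than any position in the top 1000."""
    return UNRANKED if value is None else value


def inversions(data):
    """List (better_group, worse_group, operator) triples where the group with
    the better main ranking has the worse ranking for that operator."""
    found = []
    groups = sorted(data, key=lambda g: data[g]["main"])
    for i, better in enumerate(groups):
        for worse in groups[i + 1:]:
            for op in OPERATORS:
                if position(data[better][op]) > position(data[worse][op]):
                    found.append((better, worse, op))
    return found


for better, worse, op in inversions(observed):
    print(f"{better} outranks {worse} overall but ranks lower for {op}:")
```

If one algorithm weighted the three factors the same way everywhere, you would expect few or no such inversions between groups.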
_________________ PhilC
Hidden Text
Search Engine Optimization articles and tools :: PageRank explained |
|
|
 |
Mel
Site Admin

Joined: 03 Sep 2003
Posts: 9060
|
Posted: Thu Feb 17, 2005 2:33 pm
|
|
| Post subject: |
|
|
That would be a strange way to run a search engine in that it would be throwing relevancy out the window most of the time.
It could also be a temporary condition, given that the recent update has dragged on for some time.
I am more inclined to believe that the allinanchor: search has been "fixed" just like the Google link: search, and I posted to that effect some time before the recent update. The same may be true of the allintitle: and allintext: searches; after all, like the link: search, they are of no real use to anyone besides webmasters.
The same results you have reported would also be true if a different index was being used at each group of data centers. |
_________________ Expert SEO Services - Buy Cheap Used Cars |
|
|
 |
PhilC
Site Admin

Joined: 21 Nov 2002
Posts: 13052
|
Posted: Thu Feb 17, 2005 2:44 pm
|
|
| Post subject: |
|
|
| Mel wrote: |
| The same results you have reported would also be true if a different index was being used at each group of data centers. |
True, but the top sites are the same across all the datacenters, so I think they share pretty much the same basic index. I can examine them all, though.
My inclination (imagination) is towards the idea that Google uses the various datacenter groups to try out different algorithms. To my imagination, the #40 group looks like normal Google, and the #22 group looks a bit like a trial - that's assuming that the allinanchor: etc. are working as normal.
| Mel wrote: |
| That would be a strange way to run a search engine in that it would be throwing relevancy out the window most of the time. |
Not if the different algos produced decent relevancy. As it is, we do get results from the various datacenters, so, whatever the reason for the differences (different algos, mid-update, whatever), we are receiving them all whatever their relevancy is like.
Incidentally, I saw these big differences (#23 to #30+ to #40+) before the recent big changes (update). I didn't check all the datacenters at the time, but the differences were there. |
Last edited by PhilC on Thu Feb 17, 2005 3:23 pm; edited 1 time in total |
|
|
 |
PhilC
Site Admin

Joined: 21 Nov 2002
Posts: 13052
|
Posted: Thu Feb 17, 2005 2:49 pm
|
|
| Post subject: |
|
|
I'm just thinking back a bit.
Google have burned themselves in the past by rolling out overall algo changes, e.g. Florida, and that wasn't the only occasion. Since then they've added a load more datacenters, and it could be that they are avoiding burning themselves again by testing significant algo changes 'live' in datacenter groups, rather than installing them on all datacenters simultaneously. Just a thought. |
|
|
 |
aztrx
Intermediate member
Joined: 23 Sep 2004
Posts: 52
Location: KY
|
Posted: Thu Feb 17, 2005 8:19 pm
|
|
| Post subject: |
|
|
| Wouldn't these differences exist even with the same algo if each datacenter were on a different timeslice? Example: datacenter A does a full spider/update, then datacenter B begins its update. Even a 12-hour lag could produce the subtle differences you have observed. Just a thought; no supporting evidence. |
_________________ Adult Toys & Video |
|
|
 |
PhilC
Site Admin

Joined: 21 Nov 2002
Posts: 13052
|
|
|
 |
WebTone
Advanced Member
Joined: 28 Jan 2005
Posts: 258
|
Posted: Thu Feb 17, 2005 11:51 pm
|
|
| Post subject: |
|
|
| Phil, I'm not sure I fully understand what you mean above, but I certainly get Google bots from different IPs visiting my main sites in the same cycle, and these are in different geo locations according to sitestats. |
|
|
|
 |
PhilC
Site Admin

Joined: 21 Nov 2002
Posts: 13052
|
Posted: Thu Feb 17, 2005 11:58 pm
|
|
| Post subject: |
|
|
| Yes, it uses quite a few different IPs. What I meant was that each datacenter, or group of datacenters, can't really run its own independent crawling, parsing and storing system, or the various indexes would tend to become too dissimilar. I can't see Google doing that. |
|
|
 |
yonnermark
Advanced Member

Joined: 14 Jul 2003
Posts: 1417
|
Posted: Fri Feb 18, 2005 3:45 pm
|
|
| Post subject: |
|
|
| Your findings are plausible, mainly because of what the Google chap said. He wouldn't just make stuff up for the sake of it... So I guess there's a lot of mileage in this idea. |
_________________ Royton |
|
|
 |
PhilC
Site Admin

Joined: 21 Nov 2002
Posts: 13052
|
Posted: Fri Feb 18, 2005 3:57 pm
|
|
| Post subject: |
|
|
The #22 and #27 groups merged yesterday, and became one larger #27 group. Also 3 datacenters crossed over to a different group. It's not compatible with my first thoughts - that different datacenters simply employ different algorithms.
If different algorithms are used, then more than one of them appears to reside in each datacenter. I do think that different algos are used though, because of the very distinct ranking groups. Rankings grouped like that must mean different algos or different indexes, and I don't care for the idea of different, autonomous indexes - especially when 2 groups merged and 3 datacenters crossed over.
It could be thought that the #22 group's indexes were simply updated to be the same as the (newer?) #27 group's indexes, but that doesn't appear to be the case because 1 datacenter moved the other way - from the #27 group to the ~#40 group. |
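
A minimal sketch of how group merges and crossovers like these could be tracked between two snapshots. The datacenter labels and positions below are invented for illustration (they are not my real readings), and `group_by_rank` and `changes` are hypothetical helper names.

```python
# Hypothetical snapshots: datacenter label -> position reported for the query.
# Labels and positions are made up purely to illustrate the bookkeeping.
day_1 = {"dc-a": 22, "dc-b": 22, "dc-c": 27, "dc-d": 27, "dc-e": 40, "dc-f": 41}
day_2 = {"dc-a": 27, "dc-b": 27, "dc-c": 27, "dc-d": 40, "dc-e": 40, "dc-f": 41}


def group_by_rank(snapshot, tolerance=1):
    """Cluster datacenters whose positions are within `tolerance` of the
    next-lowest position, so that #40 and #41 count as one group."""
    groups = []
    last_rank = None
    for dc, rank in sorted(snapshot.items(), key=lambda item: item[1]):
        if last_rank is None or rank - last_rank > tolerance:
            groups.append([])
        groups[-1].append(dc)
        last_rank = rank
    return [frozenset(g) for g in groups]


def changes(before, after, tolerance=1):
    """Return {datacenter: (old_position, new_position)} for every datacenter
    whose position moved by more than `tolerance` between snapshots."""
    return {
        dc: (before[dc], after[dc])
        for dc in before
        if dc in after and abs(after[dc] - before[dc]) > tolerance
    }


print(len(group_by_rank(day_1)), "groups before,", len(group_by_rank(day_2)), "after")
for dc, (old, new) in changes(day_1, day_2).items():
    print(f"{dc}: #{old} -> #{new}")
```

In this made-up data two groups merge while one datacenter moves in the opposite direction, which is the pattern that a plain index refresh would not explain.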
|
|
 |
razvan
Advanced Member

Joined: 14 Mar 2004
Posts: 290
|
Posted: Fri Feb 18, 2005 7:29 pm
|
|
| Post subject: |
|
|
Hello,
A little off-topic. PhilC, what do you use to monitor so many datacenters? I have a tool that monitors at most ten. |
_________________ | O-Zone | Travel to Romania | |
|
|
 |
PhilC
Site Admin

Joined: 21 Nov 2002
Posts: 13052
|
|
|
 |
PhilC
Site Admin

Joined: 21 Nov 2002
Posts: 13052
|
Posted: Mon Feb 21, 2005 1:32 am
|
|
| Post subject: |
|
|
I've been watching 45 datacenters over the last few days and I've found that the results from some of them change. But they are not updating in the sense of having a new index uploaded to them, because they can return one set of results one minute, another set a minute later, and then the original set again a minute after that.
I can think of 2 reasons for those changes. One is multiple algorithms. The other is that we don't always receive the results from the datacenter that we request them from - we are sometimes switched to a different datacenter.
To my mind, the second one has a ring of truth to it - and it makes some sense. Google normally shares the load of searches between datacenters, and we often see that in action between one page of results and the next. It also makes sense to share the load when a specific datacenter is being searched. Bearing in mind that Google is directing searches to the various DCs, at the time that we search a particular DC, it could well be fully loaded, and it might pass us on to another DC.
If that's happening, it means that, when we use tools to search specific DCs, we can't just assume that the results they show are from the DCs that we think they are from. And when a few DCs appear to be dancing they may not be dancing at all. |
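
A small sketch of how to tell "flipping back" apart from a genuine one-way update, assuming you log the ordered list of result URLs each time you query a DC. The sample data is invented (example.com URLs), and the function names are illustrative only.

```python
import hashlib


def fingerprint(urls):
    """Reduce an ordered list of result URLs to one comparable hash."""
    return hashlib.sha256("\n".join(urls).encode("utf-8")).hexdigest()


def returns_to_earlier_state(fingerprints):
    """True if a result set reappears after a different one was seen in
    between (A, B, A), which a one-way index update would not produce."""
    seen = set()
    previous = None
    for fp in fingerprints:
        if fp != previous and fp in seen:
            return True
        seen.add(fp)
        previous = fp
    return False


# Hypothetical samples taken a minute apart from the same DC address.
set_a = ["http://example.com/1", "http://example.com/2"]
set_b = ["http://example.com/2", "http://example.com/1"]

flipping = [set_a, set_b, set_a]         # A, B, A: consistent with being switched between DCs
updating = [set_a, set_a, set_b, set_b]  # A, A, B, B: consistent with a plain update

print(returns_to_earlier_state([fingerprint(s) for s in flipping]))
print(returns_to_earlier_state([fingerprint(s) for s in updating]))
```

A flip back to an earlier set does not prove which explanation is right, but it does rule out a simple one-off index upload for that DC.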
|
|
 |
Mel
Site Admin

Joined: 03 Sep 2003
Posts: 9060
|
Posted: Mon Feb 21, 2005 4:10 am
|
|
| Post subject: |
|
|
Yep, the load-balancing idea seems to make sense, Phil, and of course that has to be one of the reasons for having so many data centers.
But I think that Google is also still tweaking the results of the last update, which further complicates things, and they just might be doing one thing on one data center and another somewhere else. |
|
|
 |
PhilC
Site Admin

Joined: 21 Nov 2002
Posts: 13052
|
Posted: Mon Feb 21, 2005 1:03 pm
|
|
| Post subject: |
|
|
Switching us to another DC when we search a particular DC could also be because the one we search is stood down very briefly while some sort of update is being uploaded.
There is quite a bit of activity in the DCs, some of which I am putting down to being switched to other DCs, and some of it appears to be changes/updates. |
|