Thursday, December 6, 2007

How does LinkedIn generate its “people you may know” list?

My Google Analytics reports tell me that a significant number of people land on my blog by searching Google for this phrase: How does LinkedIn generates “people you may know” list? Well, I only mentioned this in passing in one of my earlier posts. That post, I suspect, would not have satisfied the Google searchers and in all probability left them with a serious urge to kick my and Google’s posterior.

So, in order to save my ass (I don’t care what you do with Google’s), let me give this question a shot. Just to clarify: I don’t have any inside information about the algorithm LinkedIn uses; this is merely my speculation.

Some time back, one of my fellow networkers on LinkedIn asked the exact same question in the Q&A section. I, just like the United States Marines, came to the rescue on that occasion. Here is my response to that question, reproduced verbatim for your benefit.

“If I had to develop this, I would use the following criteria to determine a potential connection for you

1) Attended same school/ universities (higher weight if you graduated in the same year),
2) Worked in the same firm (worked during the same period),
3) Number of shared connections,
4) Linkedin history of contacts between you and them
5) Contact list imported from outlook/ gmail etc. (These people were perhaps not part of linkedin when you imported the contact list; but they have joined the network since then).

Not sure what's the exact algorithm linkedin uses though.”


Let me add a couple of clarifications, since I’ve taken the answer out of context. On point 4, I was referring to people who might have applied to your job posting on LinkedIn in the past, or who privately replied to your questions and got a response in return.

Although I’m not sure what algorithm LinkedIn uses, I won’t be surprised if the only thing it does is the last point I mentioned (point 5, matching imported contact lists). This is a source of confusion for many people: LinkedIn shows names from their Gmail/Outlook address books in the “people you may know” section.
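To make point 5 concrete, here is a minimal Python sketch of how imported address books could be matched against members who joined later. The function name, the record layout and the sample data are all made up for illustration; nothing here reflects LinkedIn's actual code.

```python
from datetime import date

def suggestions_from_imports(imported_emails, import_date, members):
    """Return names of members whose email was in an imported address book
    but who joined the network only after the import happened."""
    imported = {email.strip().lower() for email in imported_emails}
    return sorted(
        m["name"]
        for m in members
        if m["email"].lower() in imported and m["joined"] > import_date
    )

# Hypothetical member records.
members = [
    {"name": "Asha", "email": "asha@example.com", "joined": date(2007, 9, 1)},
    {"name": "Bob", "email": "bob@example.com", "joined": date(2006, 1, 15)},
    {"name": "Carl", "email": "carl@example.com", "joined": date(2007, 11, 20)},
]

result = suggestions_from_imports(
    ["Asha@example.com", "bob@example.com", "dave@example.com"],
    date(2007, 6, 1),
    members,
)
print(result)  # ['Asha']
```

Bob is skipped because he was already a member at import time, and Carl because he was never in the address book.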


Updated:


Looking back at my answer, I feel I can add a few more factors to the algorithm.

6) Look for members who have the same zip/area code as you. This has to be used in combination with other factors, such as employment history. For example, I definitely don't know everyone in Bangalore and may not know a lot of people at Yahoo!. But when you use my location and employment history in conjunction, chances are I would know a person who works at Yahoo! in Bangalore.

7) This is similar to point #4. LinkedIn should also look at other members' activities to see if they would be a potential contact for you. If someone visits your profile frequently, chances are he knows you and can be considered a potential contact for you (this is a kind of backtracking). LinkedIn has recently started tracking this activity, although they don't always display the name of the person who visited your profile.

So we have all these factors that would help LinkedIn determine a potential connection for you. Again, we can only guess at how LinkedIn would do the processing. They would perhaps start with your first-level contacts and then traverse your network graph, calculating a "homophily" score between you and other members. The homophily (not to be confused with homophile) score would be a weighted sum of all these factors.
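The traversal and the weighted sum can be sketched in a few lines of Python. This is only an illustration of the guess above: the weights, the feature names, the three-degree cutoff and the toy graph are all assumptions of mine, not anything LinkedIn has published.

```python
from collections import deque

# Hypothetical weights, invented for illustration.
WEIGHTS = {
    "same_school": 2.0,
    "same_employer": 2.0,
    "shared_connections": 0.5,  # applied per shared connection
    "viewed_your_profile": 1.0,
}

def degrees_from(graph, user, max_depth=3):
    """Breadth-first walk returning {member: degree of separation}."""
    degree = {user: 0}
    queue = deque([user])
    while queue:
        current = queue.popleft()
        if degree[current] == max_depth:
            continue  # do not expand beyond the cutoff
        for neighbour in graph.get(current, ()):
            if neighbour not in degree:
                degree[neighbour] = degree[current] + 1
                queue.append(neighbour)
    return degree

def homophily_score(graph, user, other, profile_features):
    """Weighted sum of the factors known about `other`."""
    features = dict(profile_features.get(other, {}))
    features["shared_connections"] = len(set(graph[user]) & set(graph[other]))
    return sum(WEIGHTS[name] * value for name, value in features.items())

def people_you_may_know(graph, user, profile_features, max_depth=3):
    """Rank everyone who is 2 or more degrees away, highest score first."""
    degree = degrees_from(graph, user, max_depth)
    candidates = [m for m, d in degree.items() if d >= 2]
    scored = [(homophily_score(graph, user, m, profile_features), m)
              for m in candidates]
    return [m for _, m in sorted(scored, key=lambda pair: (-pair[0], pair[1]))]

# Toy network: "me" is directly connected to "a" and "b".
graph = {
    "me": ["a", "b"],
    "a": ["me", "c"],
    "b": ["me", "c", "d"],
    "c": ["a", "b", "e"],
    "d": ["b"],
    "e": ["c"],
}
profile_features = {
    "c": {"same_school": 1},
    "d": {"same_employer": 1},
    "e": {"viewed_your_profile": 1},
}

suggested = people_you_may_know(graph, "me", profile_features)
print(suggested)  # ['c', 'd', 'e']
```

Here "c" wins because it shares a school and two connections with "me" (score 3.0), ahead of "d" (2.5) and "e" (1.0).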

Once LinkedIn has identified all the factors, all they have to do is keep tweaking the weights assigned to them when calculating the score. The obvious way to measure the effectiveness of this algorithm is to track how often members follow these links and add the suggested "people you may know" to their network.
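As a rough sketch of that measurement, assume each weight setting produces a simple log of "shown" and "added" events (a format I am inventing here). The setting with the higher acceptance rate would then be the one to keep.

```python
def acceptance_rate(events):
    """Fraction of shown suggestions that ended up being added as connections."""
    shown = sum(1 for e in events if e == "shown")
    added = sum(1 for e in events if e == "added")
    return added / shown if shown else 0.0

# Hypothetical event logs for two candidate weight settings.
logs = {
    "weights_a": ["shown", "shown", "added", "shown", "shown"],
    "weights_b": ["shown", "added", "shown", "added", "shown", "shown"],
}

rates = {name: acceptance_rate(events) for name, events in logs.items()}
best = max(rates, key=rates.get)
print(rates, best)
```

With these made-up logs, setting A converts 1 of 4 suggestions and setting B converts 2 of 4, so B would be preferred.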

One subtle point: to create a "wow" experience, members who are more degrees away from you might be ranked higher than someone who is already a 2nd or 3rd degree contact.
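One possible way to express this "wow" boost is to multiply the base score by a bonus that grows with the degree of separation. The function name and the 0.25 bonus below are purely hypothetical.

```python
def wow_rank(candidates, bonus_per_degree=0.25):
    """Rank (name, base_score, degree) tuples, boosting more distant members."""
    def boosted(item):
        _, score, degree = item
        # A 2nd degree contact gets no boost; each extra degree adds a bonus.
        return score * (1 + bonus_per_degree * (degree - 2))
    return [name for name, _, _ in sorted(candidates, key=boosted, reverse=True)]

candidates = [("c", 3.0, 2), ("e", 2.5, 4)]
print(wow_rank(candidates))                      # ['e', 'c']
print(wow_rank(candidates, bonus_per_degree=0))  # ['c', 'e']
```

With the boost, the 4th degree member "e" (2.5 × 1.5 = 3.75) overtakes the 2nd degree member "c" (3.0), even though its base score is lower.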

And before I close this topic, let me remind you that this is only my idea of how LinkedIn might be doing it. The actual implementation could be completely different.

So now that I’ve quenched your thirst for knowledge, you can perhaps go back to doing more productive work and searching for other important things in Google; say, for example, Lindsay Lohan pictures.

7 comments:

  1. A really interesting answer. Though I do not use Outlook and do not share any institution/company with some of the people shown on the page, and have communicated only by my Yahoo! account, I cannot say that others do not use Outlook.

    I would like to add the use of OpenSocial API for finding contacts.

    But I think, it is generally driven by people visiting your profile.

    -rupesh.

  2. There is a tool on LinkedIn that allows you to find more LinkedIn connections by using your contact/address book through Gmail, AOL, Yahoo, Hotmail, Outlook, etc. The bad thing with doing this is that LinkedIn actually searches through chains of emails. So, if I forward you a random email from someone that happens to have a LinkedIn account, it will recognize that person as "People you may know". I'm not certain that this is the case, but I have a pretty good hunch. If this is the method for generating "People you may know", I think this is completely an invasion of privacy. I am giving LinkedIn permission to search through my contacts, not my contacts' contacts.

  3. hey...the address book upload tool is used quite smartly by linkedin.

    One of the techniques I suspect they use is suggesting people who have uploaded you via their address book tool, even if you have not uploaded them.

  4. Just wanted to comment on what anonymous said... I'm a product manager for LinkedIn and one of my product areas is the contact importers. I can personally assure that we do not search through emails. I agree wholeheartedly that that would be an invasion of privacy, so we just don't do it. In fact, the tools that we use to import don't even have that ability.

    Some email clients (gmail, for example) add anyone you've exchanged emails with to your address book. When we import the address book then, it contains all those contacts, but we never look at emails.

    Chris

  5. Hey Chris. Yeah, I would have been surprised if Linked In did it that way. I've praised linked in for its privacy policy elsewhere in my blog.

    So how close was my algorithm to what you have actually implemented? :-)

  6. Can we ask LinkedIn if they try to import your address book without the user manually doing this - i.e. if your email address is @gmail - and your password is the same on gmail as LinkedIn do they harvest your address book?... I suspect so... here is why:

    I have never used the import contacts feature with my Gmail, as I only use this address for personal stuff (not work); as such, the contacts are not ones I want to expose to LinkedIn. I purchased an item off eBay from someone; the seller's email was in my contacts as I have replied to him and vice versa... Last week this seller came up in my "people you may know". Aside from changing my LinkedIn password (done!), how else could this have taken place?

  7. MH,
    Have you thought about the possibility that the eBay seller might have imported his contacts to LinkedIn? The fact that you were found in his contacts might have suggested to LinkedIn that you two know each other.
