OF PIGEON, PANDA, PENGUIN, AND HUMMINGBIRD
We explore the solid foundations and some rather intricate pillars that make up the most ubiquitous and pervasive search engine in the world.
Out of some obscure sense of political (technological?) correctness and a slightly comical spirit of inclusiveness, we tend to use the broad term "search engine" when we in fact have in mind Google, the search and advertising behemoth that has broadened its horizons (and, dare we say, invaded ours rather aggressively) in the past decade and a half. This chapter will talk about Google's PageRank algorithm that ranks its search results, the updates to it, and how they have improved the quality of the results that it churns out.
Google's PageRank translates simple and intuitive principles into rigorous and efficient math to give you high-quality, relevant results. A high-ranking page implies more relevant content and less annoying spam. If a high-ranking page links to another web page, it's a fair assumption that the linked page is important and reliable as well. PageRank does consider it so and is therefore recursive in that sense. The second factor that determines the rank of a page is its actual content: its text, how it's formatted, how it looks, and what it contains.
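That recursive idea can be written down compactly. What follows is the standard textbook formulation with a damping factor, a sketch of the principle rather than Google's exact production formula:

```latex
% Textbook PageRank of a page p (not necessarily Google's production formula):
%   d    = damping factor (commonly quoted as 0.85)
%   N    = total number of pages
%   B(p) = set of pages linking to p
%   L(q) = number of outbound links on page q
PR(p) = \frac{1 - d}{N} + d \sum_{q \in B(p)} \frac{PR(q)}{L(q)}
```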
Judging a book by its cover
The bare-bones PageRank developed by Sergey Brin and Larry Page in 1998 has received several major updates over the years. As a result, today's PageRank algorithm is significantly more complex than the first rudimentary methodology laid down by Brin and Page. However, the basic idea remains the same. PageRank works by representing the entire internet as a graph. Mathematically, the graph's nodes represent the web pages, and an edge between two nodes in the graph represents a link between those two pages.
Naturally, these graphs will be directed, i.e., the edges will have a direction specified, because a page may link to another page but may or may not be linked back by it. The graph is then translated into a square (N x N) matrix of gigantic dimensions, since each row (column) represents links from (to) each page; for the entire internet, N easily approaches tens of billions. The ranks of all the pages are computed simultaneously by finding an eigenvector of the matrix: a specific "direction" that the matrix leaves unchanged, apart from stretching it. For a pair of pages, the one with the higher rank will be displayed higher up in the search results.
The crucial step, and also the most mathematically and computationally intensive one, is finding that eigenvector. Fortunately, since it's not common for websites to randomly link to unrelated web pages, a huge majority of the matrix elements are 0, simplifying the calculations a little. It should also be no surprise that this step is the one that has been innovated upon the most. After Page and Brin's "power method", the revisions in the PageRank computations have included the accelerated power method, aggregation/disaggregation methods, Krylov methods, and Schwarz methods, spanning a total of seven years starting from 1999.
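To make the eigenvector step concrete, here is a minimal power-method sketch in Python. The four-page "web" and the damping factor are illustrative assumptions, not Google's actual data or parameters:

```python
# Minimal PageRank via the power method (illustrative sketch, not Google's code).
# links[i] lists the pages that page i links to; this four-page web is made up.
links = {0: [1, 2], 1: [2], 2: [0], 3: [0, 2]}
N = len(links)
d = 0.85  # damping factor; 0.85 is the value commonly quoted in the literature

rank = [1.0 / N] * N  # start with a uniform rank vector
for _ in range(50):  # repeatedly apply the link matrix until the vector settles
    new_rank = [(1 - d) / N] * N
    for page, outlinks in links.items():
        for target in outlinks:
            # each page passes its rank, split evenly, to the pages it links to
            new_rank[target] += d * rank[page] / len(outlinks)
    rank = new_rank

# the stable vector is the dominant eigenvector: higher value = higher rank
print(sorted(range(N), key=lambda p: -rank[p]))
```

Because real link matrices are overwhelmingly sparse, production implementations only ever touch the nonzero entries, which is exactly why all those zeros matter.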
Building a better mousetrap
Starting with 2011, each year has seen one major headline update to the software, respectively codenamed Google Panda, Google Penguin, Google Hummingbird, and Google Pigeon. Panda focused on downgrading "low-quality" or "thin" websites, and as a result news sites and social media websites gained prominence in Google's search results. Penguin was targeted at websites that used search engine optimization (SEO) techniques that amounted to spamming and link manipulation to make the website seem more important than its content deserved.
Hummingbird was arguably the quantum leap in terms of making Google's search results "smarter", because in addition to considering synonyms, it now also emphasized the context of each query. Michelle Hill, marketing manager at Vertical Leap, explains it as a need to "think about how people are looking for something" instead of simply looking at what they typed in as their query. Pigeon's aim was to work in tandem with Google Maps and give more importance to search results geographically closer to the user.
Naturally, this also impacted Google Maps search results. Searching for a grocery store near you is now more likely to throw up Old Uncle Jimmy's store on the corner of 14th Street than it is to point you to the Big Bazaar at the end of the busiest lane in the center of the city.
OC master race
Panda itself was updated several times from February to April 2011, at which point it was made operational. It uses the number of links to a site and the number of search queries about the site's brand to determine a ratio, which in turn determines each individual page's ranking factor. In May 2011, Google's webmaster blog published a guide to building high-quality websites.
It gave out a general idea of what Google thought about while trying to downgrade unhelpful websites. Trustworthiness, both of the information the website gives out and of the information it collects, is paramount.
Google also considers whether the article is written by someone with in-depth knowledge of the topic or whether it is merely rehashed content with synonyms thrown in for fluff, eventually to nobody's benefit. In 2012, the blog outlined what it termed "black hat" SEO techniques, aimed purely at getting a higher ranking and not at actually helping users. These practices include keyword stuffing and paying for links; they do not contribute to the user experience in any way, nor do they make the website easier to index. Google doesn't like that one bit, so it takes a kitchen knife to such websites and ensures that their ranking decreases.
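As a toy illustration of the kind of signal such a filter might look at, consider keyword density. The threshold and tokenization below are our own assumptions; Google has never published its actual heuristics:

```python
# Toy keyword-stuffing check (our own illustrative heuristic, not Google's).
def keyword_density(text: str, keyword: str) -> float:
    words = text.lower().split()
    if not words:
        return 0.0
    return words.count(keyword.lower()) / len(words)

page = "cheap shoes cheap shoes buy cheap shoes best cheap shoes online"
# a density this high would look like stuffing under almost any threshold;
# the 0.1 cutoff below is an arbitrary assumption for the example
if keyword_density(page, "cheap") > 0.1:
    print("suspiciously high keyword density - likely stuffing")
```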
Reading between the lines
Google Hummingbird, named so because of the need to be "precise and fast", represents a monumental change in the search engine's mechanisms. Riding on the back of its past updates Panda and Penguin, Hummingbird enabled Google to understand the context behind queries, and therefore looked at the actual intent rather than simply the literal words typed in.
It should be noted that this doesn't merely mean considering synonyms, which it already used to do anyway, but also considering neighbouring terms in the search as well as the immediate past queries. For example, once a search for "Terminator lead actor" returns Arnold Schwarzenegger to you, you can follow up with "How tall is he" and Google will understand that you're referring to what you just found out. Similarly, you could work with your favorite sports team, looking up "FC Barcelona fixtures" first, and then asking for their top goalscorer (a simplified sketch of this carry-over follows below). It's important to remember that while this integration of natural language processing (NLP) did produce more relevant contextual results, it did not impact existing SEO techniques as such. Indeed, this was taken up because there was a shift in the kind of queries being asked. Instead of asking for "Walmart stock price", people are asking Google to "navigate to the nearest Walmart" instead, because they're doing so not at their desktops at home, but walking down the street speaking these queries into their smartphones. And Google has coped with this magnificently. "Okay Google, walk to the nearest metro station," I asked, and was immediately directed to the correct one out of the three fairly close options I had, the maximum difference being a couple of hundred meters or so. It was a pleasant surprise. With Google's flagship, the Nexus 6P, clearly focusing on aggressive battery management and having a larger capacity overall, the company is making life for a lot of power users a breeze.
The more bought into the Android/Google ecosystem you are, the more seamless everything feels. Continuity and reliability were never an issue. In fact, in the past few years, with the advent of the wonderfully expressive and clean material design philosophy and a largely consistent and powerful NLP mechanism running behind the scenes of the classic white Google search bar, it has become a more personalised and, dare we say, more intuitive search engine than ever before. Of course, Google's emphasis on original and helpful content has always remained its core advice to all websites, and Hummingbird was not focused on ranking down poor websites as such; instead it was more about walking in the shoes of the users to get at the meaning behind their questions.
Caught up in semantics: "Things, not strings"
In 2012, Google introduced a system of information organization that they call their "Knowledge Graph" to the world. Searching can be thought of in two ways. One way is to ask where certain words can be found in a really large place. This way assumes that you will find your answer in a fixed place and that it's only a question of looking as fast as possible through as much as possible before you inevitably get to your solution.
The second way is to feed a few words to the search engine, which then relates those words to as many things as possible (this may even be based on your search history, location, and other information about you) and, based on which relations seem to make the most sense, throws a few results at you. The combination of these results eventually gives you a clear picture of what you wanted to know about. The latter is what is known as "semantic search".
It's a concept that most search engines employ these days, with the aim of prioritizing context over direct meanings. It works by establishing relationships between entities about which information is available; using different weights for each edge of the graph, the most relevant information is identified and presented to you first. Google's blog post dated 16th May 2012 explains this with the example of the Taj. You may be asking about the great Mughal monument, the chain of luxury hotels, or a local place whose biryani you just cannot get enough of. The Knowledge Graph is, of course, a graph.
Every bit of information that Google possesses is laid out in the form of a graph; everything is connected. If your request the previous night was for the biryani place, Google will most likely assume that a query for Taj is currently more relevant to the eatery than to the monument. The impact of the Knowledge Graph can be seen in the right pane of Google's search results. The brief summary, important details, and related links like the "People also search for" section are all made possible by the contextual understanding of the incredible amounts of data at Google's disposal.
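A toy version of that disambiguation might look like the following, where edge weights stand in for relevance and recent activity nudges one sense of "Taj" ahead of the others. The entities, weights, and boost are all invented for illustration:

```python
# Toy "things, not strings" disambiguation (invented weights, not Google's data).
# Each sense of the string "taj" is an entity with a base relevance weight.
senses = {
    "Taj Mahal (monument)": 0.5,
    "Taj Hotels (hotel chain)": 0.3,
    "Taj Biryani Corner (local eatery)": 0.2,
}

# Recent user activity boosts related entities; here, last night's request
# about the eatery (a made-up signal for the example).
recent_activity = {"Taj Biryani Corner (local eatery)": 0.4}

scores = {entity: weight + recent_activity.get(entity, 0.0)
          for entity, weight in senses.items()}

# the eatery now outranks the monument for this particular user
for entity, score in sorted(scores.items(), key=lambda kv: -kv[1]):
    print(f"{score:.2f}  {entity}")
```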
Really close, as the Pigeon flies
2014's headline update saw Google boosting local listings in its search rankings. Pigeon naturally works closely with Maps, taking into account the user's location and the distances of the listings from them. The ranking parameters affected by these factors were improved; therefore, the closer a place is to you, the greater its weightage in its page rank.
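One crude way to picture that distance weightage is to fold a decay term into each listing's score. The formula and numbers below are our own illustration, not Pigeon's actual parameters:

```python
# Toy local-ranking score (illustrative only; Pigeon's real formula is unpublished).
import math

def local_score(base_rank: float, distance_km: float, decay: float = 0.5) -> float:
    # base_rank: the page's ordinary rank signal;
    # exp(-decay * distance) shrinks the score as the listing gets farther away
    return base_rank * math.exp(-decay * distance_km)

listings = [("Old Uncle Jimmy's", 0.4, 0.2),   # (name, base rank, km away)
            ("Big Bazaar",        0.9, 6.0)]

# the nearby corner store beats the big, distant one, as in the example above
for name, rank, dist in sorted(listings, key=lambda x: -local_score(x[1], x[2])):
    print(f"{local_score(rank, dist):.3f}  {name}")
```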
In addition, Google added support for colloquialisms alongside the conventional terms for, say, places and food. Moreover, it integrated Pigeon with its Knowledge Graph, along with spell-checking, synonyms, and context, making it an extremely powerful tool that changes the focus of searching from mere information traversal to a process of complex decision making.
Google also relaxed the emphasis on its own reviews of places when search queries specifically asked for online review websites such as Yelp and TripAdvisor, which reported an immediate impact on their search rankings when Pigeon was rolled out.


