
Ever Google search for your own name? Even if you haven’t, there’s a good chance that a friend, family member or potential employer will at some point. And when they do, do you know everything that they’ll find?

Google is chock full of personal information you may not always want public. Whether it’s gathered by the search engine itself or scummy people-search websites, you have a right to know what kind of data other people can access when they look up your name. Tap or click here to see how to remove yourself from people search sites.

What others see about you online can mean the difference in landing a job or spending more time looking for one. If you want to take control of your reputation online, here’s why you need to start searching for yourself before others beat you to it.

Use exact phrases to find more than mentions

To get started with searching yourself on Google, it’s important to know how to search for exact phrases. This means telling Google you want to look up the words you typed exactly as you typed them — with no splitting terms or looking up one word while ignoring others.

To do this, simply search for your name (or any term) in quotation marks. As an example, look up “Kim Komando” and include quotation marks. Now, Google won’t show results for Kim Kardashian along with Komando.com.

Using exact phrases will weed out results for other people with similar names to yours. If you have a more common name, you may have to go through several pages before finding yourself.

If you aren’t finding anything or your name is very common, use your name plus modifiers like the city or state you live in, the names of your school(s), the name of the company you work for or other details. Make note of anything that you don’t feel comfortable with others finding and either write down the web addresses or bookmark them.
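If you want to run several of these lookups at once, here is a minimal sketch in Python that builds the search URLs for an exact-phrase name plus modifiers; the name and modifier values are placeholders you would swap for your own.

```python
# A minimal sketch that builds Google search URLs for an exact-phrase name
# plus modifiers such as city, school, or employer. The values are placeholders.
from urllib.parse import urlencode

name = '"Kim Komando"'                      # quotation marks force an exact-phrase match
modifiers = ["Miami FL", "State University", "Acme Corp"]

queries = [name] + [f"{name} {m}" for m in modifiers]
for q in queries:
    # Each URL opens a normal Google search for that query
    print("https://www.google.com/search?" + urlencode({"q": q}))
```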

A picture says a thousand words

After you’ve saved the websites you want to go over, switch over to Google’s Image Search and scan through any pictures of you. It’s much easier to look through hundreds of images quickly versus hundreds of links, and you might be surprised at the images and websites you find.

If you find an image that concerns you, you can run a reverse image search to see where it’s hosted. To do this, follow these steps:

  • Open Google Image Search and click the Camera icon in the search bar
  • Paste a link to the image or upload the image you want to search for.
  • Your results will be shown as a combination of images and relevant websites. If an exact match is found, it will populate at the top of your results.
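If you would rather script this step for an image you already have a link to, the sketch below opens a reverse image search in your browser. The searchbyimage endpoint is an assumption based on how Google’s image search has long accepted an image address; Google may redirect it to Google Lens, and the image URL shown is a placeholder.

```python
# A minimal sketch: open a reverse image search for an image hosted at a URL.
# The image address is a placeholder, and the searchbyimage endpoint is an
# assumption; Google may redirect this request to Google Lens.
import webbrowser
from urllib.parse import urlencode

image_url = "https://example.com/photos/profile.jpg"   # placeholder

search_url = "https://www.google.com/searchbyimage?" + urlencode({"image_url": image_url})
webbrowser.open(search_url)   # opens the reverse image results in your default browser
```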

If the image has no text on it or any identifying information, don’t worry. Your image can turn up even if it only has your face.

Where you are and where you’ve been

Next, you’ll want to run a search for your past and current email addresses and phone numbers. This helps you see which sites have access to this personal data and will also show you what others can find if they look this information up.

 

If you’ve ever signed up for a discussion board or forum with your personal email address, your post history could easily show up if someone Googles you. The same can be said for social media pages and blogs. Find and make note of any posts or content that you’d prefer to make private.

Finally, run a search for your social media account usernames. Try to remember any usernames you may have used online and look those up. For example, if you search for the username “kimkomando,” you’ll turn up Kim’s Facebook, Twitter, Pinterest and Instagram accounts.

If you can’t remember, try searching for your name (as an exact phrase in quotation marks) plus the social network you want to look up. This might reveal accounts that you forgot about or that are less private than you think. If your real name is visible anywhere, it probably falls into this category.

Keep track going forward

If you want to stay on top of information that pops up about you on social media (or the rest of the web), you can set up a free Google Alert for your name. It’s an easy way to keep tabs on your online reputation.

Here’s how to set up a Google Alert for your name:

  • Visit Google.com/alerts and type what you want Google to alert you about in the search bar.
  • Click Show options to change settings for frequency, sources, language and region. You can also specify how many results you want and where you want them delivered.
  • Click Create Alert to start receiving alerts on yourself or other search topics you’re interested in.
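If you pick the RSS-feed delivery option, you can also check an alert from a script instead of waiting for email. Below is a minimal sketch assuming you have copied your alert’s feed address from the Alerts page; the feed URL is a placeholder, and the script relies on the third-party feedparser package.

```python
# A minimal sketch: print the latest items from a Google Alert delivered as an
# RSS feed. The feed URL is a placeholder copied from google.com/alerts.
# Requires: pip install feedparser
import feedparser

ALERT_FEED_URL = "https://www.google.com/alerts/feeds/YOUR_FEED_ID"  # placeholder

feed = feedparser.parse(ALERT_FEED_URL)
for entry in feed.entries:
    print(entry.title)
    print(entry.link)
    print("-" * 40)
```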

Bonus: What does Google know about me?

And last but not least, let’s take a moment to address data that Google itself keeps on you. By default, Google records every search you enter, your location (if you use Google Maps), video-watching history and searches from YouTube, and much more.

Anyone who knows your Google Account email and digs deep enough can learn plenty about your online activities. If you haven’t visited your Google Account and privacy settings in a while, now’s the time to do it.

Now that you’ve searched for yourself and taken note of content that people can see if they look you up, it’s time to take things a step further and actually remove any data that you don’t want public. Want to know how? Just follow along for part two of our guide to Google-searching yourself.

[Source: This article was published in komando.com By KOMANDO STAFF - Uploaded by the Association Member: David J. Redcliff]

Categorized in Search Engine

Searching online has many educational benefits. For instance, one study found students who used advanced online search strategies also had higher grades at university.

But spending more time online does not guarantee better online skills. Instead, a student’s ability to successfully search online increases with guidance and explicit instruction.

Young people tend to assume they are already competent searchers. Their teachers and parents often assume this too. This assumption, and the misguided belief that searching always results in learning, means much classroom practice focuses on searching to learn, rarely on learning to search.

Many teachers don’t explicitly teach students how to search online. Instead, students often teach themselves and are reluctant to ask for assistance. This does not result in students obtaining the skills they need.

 

For six years, I studied how young Australians use search engines. Both school students and home-schoolers (the nation’s fastest-growing educational cohort) showed some traits of online searching that aren’t beneficial. For instance, both groups spent greater time on irrelevant websites than relevant ones and regularly quit searches before finding their desired information.

Here are three things young people should keep in mind to get the full benefits of searching online.

1. Search for more than just isolated facts

Young people should explore, synthesise and question information on the internet, rather than just locating one thing and moving on.

Search engines offer endless educational opportunities but many students typically only search for isolated facts. This means they are no better off than they were 40 years ago with a print encyclopedia.

It’s important for searchers to use different keywords and queries, multiple sites and search tabs (such as news and images).

Part of my (as yet unpublished) PhD research involved observing young people and their parents using a search engine for 20 minutes. In one (typical) observation, a home-school family types “How many endangered Sumatran Tigers are there” into Google. They open a single website, where they read a single sentence.

The parent writes this “answer” down and they begin the next (unrelated) topic – growing seeds.

The student could have learned much more had they also searched for

  • where Sumatra is
  • why the tigers are endangered
  • how people can help them.

I searched Google using the keywords “Sumatran tigers” in quotation marks instead. The returned results offered me the ability to view National Geographic footage of the tigers and to chat live with an expert from the World Wide Fund for Nature (WWF) about them.

Clicking the “news” tab with this same query provided current media stories, including on two tigers coming to an Australian wildlife park and on the effect of palm oil on the species. Small changes to search techniques can make a big difference to the educational benefits made available online.

More can be learnt about Sumatran tigers with better search techniques. Source: Shutterstock

2. Slow down

All too often we presume search can be a fast process. The home-school families in my study spent 90 seconds or less, on average, viewing each website and searched a new topic every four minutes.

Searching so quickly can mean students don’t write effective search queries or get the information they need. They may also not have enough time to consider search results and evaluate websites for accuracy and relevance.

 

My research confirmed young searchers frequently click on only the most prominent links and first websites returned, possibly trying to save time. This is problematic given the commercial environment where such positions can be bought and given children tend to take the accuracy of everything online for granted.

Fast search is not always problematic. Quickly locating facts means students can spend time on more challenging educational follow-up tasks – like analysing or categorising the facts. But this is only true if they first persist until they find the right information.

3. You’re in charge of the search, not Google

Young searchers frequently rely on search tools like Google’s “Did you mean” function.

While students feel confident as searchers, my PhD research found they were more confident in Google itself. One Year Eight student explained: “I’m used to Google making the changes to look for me”.

Such attitudes can mean students dismiss relevant keywords by automatically agreeing with the (sometimes incorrect) auto-correct or going on irrelevant tangents unknowingly.

Teaching students to choose websites based on domain name extensions can also help ensure they are in charge, not the search engine. The easily purchasable “.com”, for example, denotes a commercial site, while websites with a “.gov” (government) or “.edu” (education) domain name extension are more likely to provide quality information.
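For readers who want to apply this systematically, here is a minimal sketch of sorting collected links by domain extension so the “.gov” and “.edu” results are checked first; the URLs are placeholders, not real results.

```python
# A minimal sketch: prefer .gov and .edu links among results a student has collected.
# The URLs below are placeholders, not real search results.
from urllib.parse import urlparse

results = [
    "https://www.example.com/tiger-facts",
    "https://www.example.gov/species/sumatran-tiger",
    "https://www.example.edu/research/tigers",
]

def extension(url: str) -> str:
    host = urlparse(url).hostname or ""
    return "." + host.rsplit(".", 1)[-1] if "." in host else ""

preferred = [u for u in results if extension(u) in (".gov", ".edu")]
other = [u for u in results if u not in preferred]
print("Check first:", preferred)
print("Then:", other)
```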

Search engines have great potential to provide new educational benefits, but we should be cautious of presuming this potential is actually a guarantee.

[Source: This article was published in studyinternational.com By The Conversation - Uploaded by the Association Member: Bridget Miller]

Categorized in Search Techniques

Search Console Insights uses both Search Console and Google Analytics data in one view

After being under the radar for a couple of months, Google has confirmed the new Google Search Console Insights. Search Console Insights is a new view of your data specifically “tailored for content creators and publishers,” Google said. It can help content creators understand how audiences discover their site’s content and what resonates with their audiences.

Search Console Insights uses both Search Console and Google Analytics data in one view. Google announced the beta today on Twitter, saying, “Today we’re starting to roll out a new experience to beta testers: Search Console Insights. It’s a way to provide content creators with the data they need to make informed decisions and improve their content.”

Access Search Console Insights. If you’re participating in the closed beta, you can access Google Search Console Insights for some of the profiles you manage in Google Search Console at https://search.google.com/search-console/insights/about. There, you can learn more about this reporting tool and click on Open Search Console Insights to potentially access the report.

What it looks like. Here is a screenshot provided by Google:

[Screenshot: Google Search Console Insights beta report]

I uploaded a full-sized, but blurred out, screenshot over here.

What Search Console Insights tells you. Google said Search Console Insights can help content creators and publishers answer questions about their site’s content, such as:

  1. What are your best performing pieces of content?
  2. How are your new pieces of content performing?
  3. How do people discover your content across the web?
  4. What are your site’s top and trending queries on Google Search?
  5. What other sites and articles link to your site’s content and did you get any new links?

Can’t access Search Console Insights? If you do not have access to Google Search Console Insights, do not worry. It is still in beta and even though Google has publicly announced it, it is not yet available to everyone.

“It is a closed beta that is currently only available to a group of users that have already received an official email from us for a specific site. We hope to open it for more users and to allow the beta group users to add more sites to it over time — stay tuned for more news and updates about this in the future,” Google said.

Why we care. As we said before, “Having certain Google Analytics data in Search Console can offer a big convenience and also help you see your data in new ways.” This Search Console Insights dashboard gives you more views of your content performance since it now blends both Google Analytics and Google Search Console data into one.

[Source: This article was published in searchengineland.com By Barry Schwartz - Uploaded by the Association Member: Mercedes J. Steinman]

Categorized in Search Engine

Having a website has long been thought of as the key to doing business with the world. But most businesses aren’t global – they’re local – so ranking on page one of Google globally isn’t nearly as valuable as having a strong strategy for local search engine optimization (SEO).

Local SEO is a catch-all for various digital practices designed to help your website rank higher in local searches, which increases your chances of being found by internet users more likely to actually patronize your business. Here are a few simple ways to instantly improve your company’s local search visibility:

Google My Business
Google is the search giant, and Google My Business (GMB) is its most powerful local tool. To start, log into (or create) your free Google account, visit GMB, add all of your locations, verify them and share photos. GMB also allows for customer reviews, so ask for them! This will help build your reputation and, in the eyes of Google’s algorithm, improve your chances of being listed higher in local search results for relevant terms.

Localize Your Website
Simple things like adding your city and state to the title of your website pages can make an impact. Instead of just “Tom’s Flower Shop,” titling your site something like “Tom’s Flower Shop | Florist in Miami, FL” helps search engines locate you. Then, go through your whole site and see where you can add localization language. There are probably dozens of opportunities to mention the city and neighborhoods you serve.
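As a quick way to audit this, here is a minimal sketch that checks whether a page’s title already mentions your city and state; the URL and location terms are placeholders, and it uses the third-party requests package.

```python
# A minimal sketch: does the page title mention the city and state?
# The URL and location terms are placeholders.
# Requires: pip install requests
import re
import requests

url = "https://www.example.com/"
location_terms = ["Miami", "FL"]

html = requests.get(url, timeout=10).text
match = re.search(r"<title>(.*?)</title>", html, re.IGNORECASE | re.DOTALL)
title = match.group(1).strip() if match else ""

missing = [t for t in location_terms if t.lower() not in title.lower()]
print("Title:", title)
print("Missing location terms:", missing or "none")
```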

Get into Local Directories
Google actually refers to Yelp and other localized directories to assess how important your business is to the local area. Ask your web firm to research which online business directories are popular with local users and make sure your business info is listed completely and accurately on all of those websites.

Get Local on Social
Creating and maintaining social media pages can help localize your business, but you need to go beyond that and engage locals online. People talk about business, new developments and products on Twitter, Instagram, Facebook and more, and these social mentions are picked up by Google. If a lot of people talk about your business and/or link to your website, search engines will assume that you are relevant.

This is just an intro to the practice of local SEO. If you’re struggling to get on people’s radar, get in touch with Brand Poets. We can point you in the right direction.

[Source: This article was published in communitynewspapers.com By Tana Llinas- Uploaded by the Association Member: Anthony Frank]

Categorized in Search Engine

Microsoft Word will have a new search experience offering more tools, such as questions, understanding in-text errors, and synonyms.

Microsoft says it wants users of Microsoft Word to have a more robust search experience when using the Office application. Specifically, a major overhaul of the app’s integrated search tool is in the works. Microsoft plans to draw more parallels with its web search experience.

When people use Microsoft Word in the future, their searches will be handled more like a typical search performed on a web browser. According to Microsoft, users will be able to surface search results even when they make an in-text error, like a typo.

Word will help by searching other related items. This behavior is similar to web search engines and is not currently available on the app. Next, Microsoft Word search will also group together forms of words.

Improving Search

When users input a search term, the app will also find other words related to the search term. Synonyms will allow users to see more results, and it will also work for multi-word searches.

“We’re utilizing well-established web search technologies, such as query and document understanding, and adding deep learning based natural language models. This allows us to handle a much broader set of search queries beyond exact match,” Microsoft says.

Microsoft also wants Word’s upcoming search update to allow users to input questions as search terms.

“With the recent breakthroughs in deep learning techniques, you can now go beyond the common search term-based queries. The result is answers to your questions based on the document content. This opens a whole new way of finding knowledge. When you’re looking at a water quality report, you can answer questions like ‘where does the city water originate from? How to reduce the amount of lead in water?’” Microsoft explains.

At the moment, all the planned changes are listed as “coming soon,” with no word on a specific launch date.

 [Source: This article was published in winbuzzer.com By Luke Jones - Uploaded by the Association Member: James Gill] 

Categorized in Internet Search

New data reveals the top searches performed on YouTube this year, along with the most popular channels.

The top 100 YouTube search queries of the year are revealed in a study examining the search volume of over 800 million keywords.

YouTube does not provide this data officially, but Ahrefs compiles a report each year based on data in its Keyword Explorer tool.

Top queries in the report are broken down by searches in the US and searches performed worldwide.

First let’s take a look at top US searches.

Top YouTube Searches in the US

These are the top 20 searches on YouTube in the United States. For a complete list of top 100 queries, see the original report.

 

Top 20 US Queries (& Search Volume)

  1. pewdiepie (3,770,000)
  2. asmr (3,230,000)
  3. music (2,670,000)
  4. markiplier (2,380,000)
  5. old town road (2,040,000)
  6. pewdiepie vs t series (1,940,000)
  7. billie eilish (1,910,000)
  8. fortnite (1,630,000)
  9. david dobrik (1,610,000)
  10. jacksepticeye (1,580,000)
  11. james charles (1,560,000)
  12. joe rogan (1,560,000)
  13. baby shark (1,500,000)
  14. bts (1,350,000)
  15. dantdm (1,330,000)
  16. snl (1,260,000)
  17. game grumps (1,140,000)
  18. cnn (1,120,000)
  19. wwe (1,100,000)
  20. lofi (1,040,000)

Some observations:
One thing that’s clear when looking at this year’s top searches compared to last year’s is more people are turning to YouTube for music.

Almost a quarter of this year’s top 100 US searches are music related. The keyword “music” itself is even the 3rd most searched term.

Another auditory experience, ASMR, comes in at #2 which is down from last year’s top position.

Five of the top 10 searches are branded, which means people are searching directly for the names of channels and YouTube creators.

In fact, 50% of the top 100 searches are for specific YouTube personalities and channels.

At a glance it would appear gaming queries are still popular, but less so compared to last year.

That could be an indication Twitch is capturing more of the gaming audience.

Let’s see how these searches compare to the top worldwide searches.

Top YouTube Searches Worldwide

These are the top 20 searches on YouTube worldwide. For a complete list of top 100 queries, see the original report.

Top 20 Worldwide Queries (& Search Volume)

  1. bts (17,630,000)
  2. pewdiepie (16,320,000)
  3. asmr (13,910,000)
  4. billie eilish (13,860,000)
  5. baby shark (12,090,000)
  6. badabun (11,330,000)
  7. blackpink (10,390,000)
  8. old town road (10,150,000)
  9. music (9,670,000)
  10. peliculas completas en español (9,050,000)
  11. fortnite (9,010,000)
  12. pewdiepie vs t series (8,720,000)
  13. minecraft (8,560,000)
  14. senorita (8,290,000)
  15. ariana grande (7,890,000)
  16. alan walker (7,560,000)
  17. calma (7,390,000)
  18. tik tok (7,270,000)
  19. musica (7,140,000)
  20. bad bunny (7,040,000)

Some observations:
It appears the whole world is using YouTube more for music, as Ahrefs points out:

“Searches for artists, bands and songs dominate our list of the top 100 worldwide YouTube searches with a staggering 57/100 searches (almost ⅔) being music‐related.

So compared to the US, it seems that the rest of the World uses YouTube far more for music.”

The rest of the world isn’t as into branded content, however, as only two of the top 10 worldwide searches are branded.

Takeaways For Marketers

Perhaps the greatest takeaway for marketers is the insight into what users generally search for on YouTube.

People primarily turn to YouTube search for: music, gaming, branded content, and already-established YouTubers.

That presents a challenge when it comes to building an audience for smaller independent channels.

It’s not impossible though, as there are ways outside of search results to generate traffic to videos.

For instance, YouTube’s suggested videos are instrumental to the success of many channels’ content.

For more on how to succeed with YouTube’s recommendation algorithm, see the resources linked in the original article.

[Source: This article was published in searchenginejournal.com By Matt Southern - Uploaded by the Association Member: Anna K. Sasaki]

Categorized in Internet Search

According to Roy Amara’s oft-cited law, people tend to overestimate the impact of new technology in the short run and underestimate it in the long run. This law appears to apply particularly well to voice and digital search.

Now, a recent survey from Perficient Digital indicates that voice may have plateaued. The agency has been performing surveys and asking more than 1,000 US adults about their usage of voice, virtual assistants, and voice search, for the last four years. The survey performed last year discovered that voice was second to the smartphone browser as the ‘first choice’ entry point for mobile search.

[Figure: How consumers search]

Most people (around 75 percent) said that they prefer to manually enter text into a search application, internet browser, or search bar on the smartphone. Thus, usage seems to be flat. According to the survey, usage of voice seems to be down for users at all education levels, although voice usage is positively correlated with education. Those people with more education tend to use voice more as compared to individuals with less education.



When the respondents were asked how often they use smart speakers to search for data, 56% said that they never use them or use them less than twice a week. 44% of respondents use smart speakers at least twice a week, and about 20% said that they use them 6 to 9 times every week.

[Figure: How often consumers search with smart speakers]

User frustration with virtual assistants not being able to understand commands or questions explains this flat-to-declining use. Enhanced accuracy and improved comprehension may generate more frequent usage. Voice is simply an alternative input mechanism for text, but it also represents a different user experience.

Behind the scenes, this technology is becoming more sophisticated. The survey also points out that voice is central for the majority of non-traditional connected gadgets: 77% of all internet-connected devices are something other than a computer, tablet, or smartphone.

 [Source: This article was published in digitalinformationworld.com By Arooj Ahmed - Uploaded by the Association Member: Anthony Frank]

Categorized in Internet Search

For eleven years, the search engine Ecosia has put most of the revenue from advertising on its website and app toward planting trees—and this month they planted their 100-millionth tree.

The German nonprofit, which became the first ‘B Corporation’ in that country because it was established for social good, has earned its founder Christian Kroll widespread praise—and one reason is that they claim to plant more native species than any other mass tree planting effort.

The phenomenon of mass tree planting began in the early 2000s when scientists began hypothesizing that the increase in CO2 emissions could be countered by replenishing the world’s forests. 

Since then, projects like Africa’s Great Green Wall (and China’s Green Great Wall) or dozens of others in Asia, like this man who planted an entire mangrove ecosystem, have seen billions of trees planted over the last two decades—although many died due to improper planting or post-planting management efforts.

Ecosia often targets countries that are the most biodiverse, where tree loss directly corresponds with species loss. This has caused them to launch projects in Nicaragua and Peru, Burkina Faso and Malawi, and Indonesia and Australia.

In 2018, for example, they created a tree nursery for 200,000 trees in Madagascar, to help create a forest corridor leading from an isolated habitat to the ocean. In 2019 they created a forest agriculture project in Borneo, to prevent locals selling the land to oil palm development.

Following the devastating fires in the Amazon, the number of people who had installed the Ecosia app doubled, allowing them to fund a 3 million tree-planting project in Brazil. In the wake of the Australian bushfires, Ecosia began restoring native forests there.

Just last year they celebrated their 50-million-tree milestone, having now doubled it in just one year’s time.

“100 million trees tackle the climate crisis by removing 1771 tonnes of CO2 every day, but it means so much more than that,” wrote Ecosia in their blog. “100 million trees means habitats for endangered animals. It means healthy rivers, more biodiversity, and fertile soil, and more fruits, nuts, and oils for local communities.”

Ecosia is a dream company for any environmentalist. Besides planting over 100 million trees, they have built their own solar power station, which generates 200% of the power required to run their servers. They have also added little notes to their search results to let you see whether a company is tree/planet friendly, or whether it relies heavily on fossil fuels.

They have also committed to never selling the company, so that no one will ever “become rich” from their efforts, except Mother Earth.

 

[Source: This article was published in goodnewsnetwork.org  - Uploaded by the Association Member: Alex Gray]

Categorized in Search Engine

“I just want search to work like it does on Amazon and Google.” I can’t tell you how many times I’ve heard that lament from friends, clients and other search folks. Frustration and dissatisfaction are common emotions when it comes to enterprise search — that is, search within the firewall.

Google on the web makes search look easy: you type in a word or two, and you get a list of dozens, if not hundreds of relevant pages. We’d all like search like that for our web and internal repositories too.

But remember that at one point, Google offered an enterprise solution in a box: the Google Search Appliance (GSA). It was a large yellow Google-branded Dell server that would crawl and index internal content, respect security and deliver pretty good results quickly. And the Google logo was available on every page to remind users they were using Google search.

The GSA was marketed to partners and corporations from 2004 through early 2019, when it was removed from the market. The GSA showed decent results, but it never lived up to user expectations. What went wrong?

Several IT managers have told me users had anticipated the quality of results to be “just like Google” — but the GSA just didn’t live up to their expectations. One search manager told me that simply adding the GSA logo to their existing non-Google search platform reduced user complaints by 40%.

I’m not proposing that you find a ‘Powered by Google’ graphic and simply add it to your search form. First, that’s misleading, and probably a violation of Google’s intellectual property. And secondly, your users will react to the quality of the results, not the search page logo.

One school of thought was that Google simply decided to focus on their primary business, delivering high quality on the web. In fact, the GSA just didn’t have access to the magic that makes its web search so good: Metadata.

It turns out that internal enterprise search is hard.

Upgrade Your User Search Experience

Partly because of its size and popularity, Google on the web takes advantage of the context available to it. That means the results you see may include queries used and pages that you have viewed in the past. But what really adds value is that Google will also include post-query behavior of other Google users who performed the same query.

The good news is you can likely improve your internal search results by implementing the same approach Google uses on the public web.

Your internal content brings some challenges of its own. On the web, there are sometimes thousands of pages that are nearly identical: if Google web shows you any one of those near duplicates, you’ll probably be satisfied. But behind the firewall, people are typically looking for a single page; and if search can’t find it, users complain.

Internal search comes with its own challenges; but it also has metadata that can be used to improve results. 

Almost all of the internal content we’ve seen with clients is secure. While parts of some repositories — think HR — are available across the organization, HR does have secure content such as payroll data, employee reviews, etc. that must not be available to all.

 

The Solution: Use the Context!

One of the differences between internet and intranet content is security. Security generally falls into two areas: user level and content level, and it should come into play for both.

User Level Security

In a lot of enterprise environments, many, if not most, repositories apply user- or content-level security, and there are typically a number of elements involved. The same fields can be used to add useful metadata. Fields that are available and make sense to include as user-level metadata may include the following:

  • Office location, department and time zone
  • Role & title
  • Direct phone & email
  • Manager name & contact info
  • List of active clients
  • Key accounts

Content Level Security

  • Access level
  • Content activity, including queries, viewed result pages, and saved and/or rejected results

Actually, this is really a starting data point; examine, experiment, and dive in!
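As one concrete way to experiment, here is a minimal sketch of the idea: apply content-level security first, then boost results whose metadata matches the searching user’s profile. The field names and weights are illustrative assumptions, not any particular product’s schema.

```python
# A minimal sketch: filter by content-level security, then re-rank results
# using user-level metadata (department, office). Fields and weights are
# illustrative assumptions only.
from dataclasses import dataclass, field

@dataclass
class Doc:
    title: str
    base_score: float                  # relevance score from the search engine
    department: str = ""
    office: str = ""
    allowed_groups: set = field(default_factory=set)

user = {"department": "HR", "office": "Chicago", "groups": {"hr-staff"}}

def contextual_score(doc: Doc) -> float:
    score = doc.base_score
    if doc.department == user["department"]:
        score *= 1.5                   # boost same-department content
    if doc.office == user["office"]:
        score *= 1.2                   # smaller boost for the same office
    return score

docs = [
    Doc("Payroll calendar", 1.0, department="HR", office="Chicago",
        allowed_groups={"hr-staff"}),
    Doc("Company picnic photos", 1.1, department="Marketing"),
]

# Security first: hide documents the user's groups cannot access.
visible = [d for d in docs if not d.allowed_groups or d.allowed_groups & user["groups"]]
for d in sorted(visible, key=contextual_score, reverse=True):
    print(f"{contextual_score(d):.2f}  {d.title}")
```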

[Source: This article was published in cmswire.co By Miles Kehoe - Uploaded by the Association Member: Dana W. Jimenez]

Categorized in Internet Search

Google has made some substantial new changes to their “How Google Search Works” documentation for website owners. And as always when Google makes changes to important documents with an impact on SEO, such as How Search Works and the Quality Rater Guidelines, there are some key insights SEOs can glean from the changes Google has made.

Of particular note: Google detailing how it views a “document” as potentially comprising more than one webpage, what Google considers primary and secondary crawls, as well as an update to their reference to “more than 200 ranking factors,” which has been present in this document since 2013.

But here are the changes and what they mean for SEOs.


Crawling

Google has greatly expanded this section.

They made a slight change to wording, with “some pages are known because Google has already crawled them before” changed to “some pages are known because Google has already visited them before.”   This is a fairly minor change, primarily because Google decided to include an expanded section detailing what crawling actually is.

Google removed:

This process of discovery is called crawling.

The removal of the crawling definition was simply because it was redundant.  In Google’s expanded crawling section, they included a much more detailed definition and description of crawling instead.

The added definition:

Once Google discovers a page URL, it visits, or crawls, the page to find out what’s on it. Google renders the page and analyzes both the text and non-text content and overall visual layout to decide where it should appear in Search results. The better that Google can understand your site, the better we can match it to people who are looking for your content.

There is still a great debate on how much page layout is taken into account. The page layout algorithm released many years ago penalized content that was pushed well below the fold in order to increase the odds a visitor would click on an advertisement appearing above the fold instead. But with more traffic moving to mobile, and the addition of mobile-first indexing, the distinction between above and below the fold in page layout seemingly became less important.

When it comes to page layout and mobile first, Google says:

Don’t let ads harm your mobile page ranking. Follow the Better Ads Standard when displaying ads on mobile devices. For example, ads at the top of the page can take up too much room on a mobile device, which is a bad user experience.

But in How Google Search Works, Google is specifically calling attention to the “overall visual layout” with “where it should appear in Search results.”

It also brings attention to “non-text” content. While the most obvious example is image content, the reference is quite open ended. Could this refer to OCR as well, which we know Google has been dabbling in?

Improving Your Crawling

Under the “to improve your site crawling” section, Google has expanded this section significantly as well.

Google has added this point:

Verify that Google can reach the pages on your site, and that they look correct. Google accesses the web as an anonymous user (a user with no passwords or information). Google should also be able to see all the images and other elements of the page to be able to understand it correctly. You can do a quick check by typing your page URL in the Mobile-Friendly test tool.

This is a good point – so many new site owners end up accidentally blocking Googlebot from crawling or not realizing their site is set to be viewable by logged-in users only. This makes it clear that site owners should try viewing their site without being logged into it, to see if there are any unexpected accessibility or other issues that aren’t noticed when logged in as an admin or high-level user.

Also recommending site owners check their site via the Mobile-Friendly testing tool is good, since even seasoned SEOs use the tool to quickly see if there are Googlebot specific issues with how Google is able to see, render and crawl a specific webpage – or a competitor’s page.
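For those who want to script the same check, here is a minimal sketch that fetches a page anonymously and asks robots.txt whether Googlebot may crawl it; the URLs are placeholders, and it uses the third-party requests package.

```python
# A minimal sketch: can an anonymous visitor (and Googlebot) reach this page?
# The URLs are placeholders. Requires: pip install requests
import requests
from urllib.robotparser import RobotFileParser

page = "https://www.example.com/products/widget"

# 1. Fetch with no cookies or login, the way Google sees the page.
resp = requests.get(page, allow_redirects=True, timeout=10)
print("Status code:", resp.status_code, "| final URL:", resp.url)

# 2. Ask robots.txt whether Googlebot is allowed to crawl the URL.
rp = RobotFileParser()
rp.set_url("https://www.example.com/robots.txt")
rp.read()
print("Googlebot allowed by robots.txt:", rp.can_fetch("Googlebot", page))
```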

Google expanded their specific note about submitting a single page to the index.

If you’ve created or updated a single page, you can submit an individual URL to Google. To tell Google about many new or updated pages at once, use a sitemap.

Previously, it just mentioned submitting changes to a single page using the submit URL tool. This adds clarification for those who are newer to SEO that they do not need to submit every single new or updated page to Google individually, and that using sitemaps is the best way to do that. There have definitely been new site owners who add each page to Google using that tool because they don’t realize sitemaps are a thing. Part of this is that WordPress is such a prevalent way to create a new website, yet it does not have native support for sitemaps, so site owners need to either install a specific sitemaps plugin or use one of the many SEO tool plugins that offer sitemaps as a feature.

This new change also highlights using the tool for creating pages as well, instead of just the previous reference of “changes to a single page.”

Google has also made a change to the section about “if you ask Google to crawl only one page” section as well.  They are now referencing what Google views as a “small site” – according to Google,  a smaller site is one with less than 1,000 pages.

Google also stresses the importance of a strong navigation structure, even for sites it considers “small.”  It says site owners of small sites can just submit their homepage to Google, “provided that Google can reach all your other pages by following a path of links that start from your homepage.”

With so many sites being on WordPress, it is less likely that there will be random orphaned pages that are not accessible by following links from the homepage. But depending on the specific WordPress theme used, sometimes there can be orphaned pages from pages being added but not manually added to the pages menu… in these cases, if a sitemap is used as well, those pages shouldn’t be missed even if not directly linked from the homepage.
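For site owners whose setup doesn’t generate a sitemap, a hand-rolled one is simple to produce. Here is a minimal sketch following the sitemaps.org format; the URLs are placeholders.

```python
# A minimal sketch: write a basic sitemap.xml for a handful of pages.
# The URLs are placeholders; the format follows the sitemaps.org protocol.
import xml.etree.ElementTree as ET

urls = [
    "https://www.example.com/",
    "https://www.example.com/about",
    "https://www.example.com/blog/new-post",
]

urlset = ET.Element("urlset", xmlns="http://www.sitemaps.org/schemas/sitemap/0.9")
for u in urls:
    url_el = ET.SubElement(urlset, "url")
    ET.SubElement(url_el, "loc").text = u

ET.ElementTree(urlset).write("sitemap.xml", encoding="utf-8", xml_declaration=True)
print("Wrote sitemap.xml with", len(urls), "URLs")
```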

In the “get your page linked to by another page” section, Google has added that links in “advertisements, links that you pay for in other sites, links in comments, or other links that don’t follow the Google Webmaster Guidelines won’t be followed by Google.” A small change, but Google is making it clear that it is a Google-specific thing that these links won’t be followed; they might still be followed by other search engines.

But perhaps the most telling part of this is at the end of the crawling section, Google adds:

Google doesn’t accept payment to crawl a site more frequently, or rank it higher. If anyone tells you otherwise, they’re wrong.

It has long been an issue with scammy SEO companies guaranteeing first positioning on Google, promising increased rankings or requiring payment to submit a site to Google. And with the ambiguous Google Partner badge for AdWords, many use the Google Partners badge to imply they are certified by Google for SEO and organic ranking purposes. That said, most of those who are reading How Search Works probably are already aware of this. But it is nice to see Google add this in writing again, for times when SEOs need to prove to clients that there is not a “pay to win” option, outside of AdWords, or simply to show someone who might be falling for a scammy SEO company’s claims of Google rankings.

The Long Version

Google then gets into what they call the “long version” of How Google Search Works, with more details on the above sections, covering more nuances that impact SEO.

Crawling

Google has changed how they refer to the “algorithmic process”.  Previously, it stated “Googlebot uses an algorithmic process: computer programs determine which sites to crawl, how often and how many pages to fetch from each site.”  Curiously, they removed the reference to “computer programs”, which provoked the question about which computer programs exactly Google was using.

The new updated version simply states:

Googlebot uses an algorithmic process to determine which sites to crawl, how often, and how many pages to fetch from each site.

Google also updated the wording for the crawl process, changing that it is “augmented with sitemap data” to “augmented by sitemap” data.

Google also made a change where it referenced that Googlebot “detects” links and changed it to “finds” links, as well as changes from Googlebot visiting “each of these websites” to the much more specific “page”.  This second change makes it more accurate since Google visiting a website won’t necessarily mean it crawls all links on all pages.  The change to “page” makes it more accurate and specific for webmasters.

Previously it read:

As Googlebot visits each of these websites it detects links on each page and adds them to its list of pages to crawl.

Now it reads:

When Googlebot visits a page it finds links on the page and adds them to its list of pages to crawl.

Google has added a new section about using Chrome to crawl:

During the crawl, Google renders the page using a recent version of Chrome. As part of the rendering process, it runs any page scripts it finds. If your site uses dynamically-generated content, be sure that you follow the JavaScript SEO basics.

By referencing a recent version of Chrome, this addition is clarifying the change from last year where Googlebot was finally upgraded to the latest version of Chromium for crawling, an update from Google only crawling with Chrome 41 for years.

Google also notes it runs “any page scripts it finds,” and advises site owners to be aware of possible crawl issues as a result of using dynamically-generated content with the use of JavaScript, specifying that site owners should ensure they follow their JavaScript SEO basics.

Google also details the primary and secondary crawls, something that has garnered much confusion since Google revealed primary and secondary crawls, but Google’s details in this How Google Search Works documents detail it differently than how some SEOs previously interpreted it.

Here is the entire new section for primary and secondary crawls:

Primary crawl / secondary crawl

Google uses two different crawlers for crawling websites: a mobile crawler and a desktop crawler. Each crawler type simulates a user visiting your page with a device of that type.

Google uses one crawler type (mobile or desktop) as the primary crawler for your site. All pages on your site that are crawled by Google are crawled using the primary crawler. The primary crawler for all new websites is the mobile crawler.

In addition, Google recrawls a few pages on your site with the other crawler type (mobile or desktop). This is called the secondary crawl, and is done to see how well your site works with the other device type.

In this section, Google refers to primary and secondary crawls as being specific to their two crawlers – the mobile crawler and the desktop crawler.  Many SEOs think of primary and secondary crawling in reference to Googlebot making two passes over a page, where javascript is rendered on the secondary crawl.  So while Google clarifies their use of desktop and mobile Googlebots, the use of language here does cause confusion for those who use this to refer to the primary and secondary crawls for javascript purposes.  So to be clear, Google’s reference to their primary and secondary crawl has nothing to do with javascript rendering, but only to how they use both mobile and desktop Googlebots to crawl and check a page.

What Google is clarifying in this specific reference to primary and secondary crawl is that Google is using two crawlers – both mobile and desktop versions of Googlebot – and will crawl sites using a combination of both.

Google did specifically state that new websites are crawled with the mobile crawler in their “Mobile-First Indexing Best Practices” document, as of July 2019. But this is the first time it has made an appearance in their How Google Search Works document.

Google does go into more detail about how it uses both the desktop and mobile Googlebots, particularly for sites that are currently considered mobile first by Google.  It wasn’t clear just how much Google was checking desktop versions of sites if they were mobile first, and there have been some who have tried to take advantage of this by presenting a spammier version to desktop users, or in some cases completely different content.  But Google is confirming it is still checking the alternate version of the page with their crawlers.

So sites that are mobile first will see some of their pages crawled with the desktop crawler.  However, it still isn’t clear how Google handles cases where they are vastly different, especially when done for spam reasons, as there doesn’t seem to be any penalty for doing so, aside from a possible spam manual action if it is checked or a spam report is submitted.  And this would have been a perfect opportunity to be clearer about how Google will handle pages with vastly different content depending on whether it is viewed on desktop or on mobile.  Even in the mobile friendly documents, Google only warns about ranking differences if content is on the desktop version of the page but is missing on the mobile version of the page.
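One practical way to see which crawler is primary for your own site is to count Googlebot hits by device type in your server logs. Here is a minimal sketch; the log path is a placeholder and the substring checks are a simplification of Google’s published user-agent strings.

```python
# A minimal sketch: tally mobile vs. desktop Googlebot visits from an access log.
# The log path is a placeholder; the user-agent checks are simplified.
from collections import Counter

counts = Counter()
with open("access.log", encoding="utf-8", errors="ignore") as log:
    for line in log:
        if "Googlebot" not in line:
            continue
        if "Android" in line or "Mobile" in line:
            counts["mobile Googlebot"] += 1
        else:
            counts["desktop Googlebot"] += 1

print(counts)
```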

How does Google find a page?

Google has removed this section entirely from the new version of the document.

Here is what was included in it:

How does Google find a page?

Google uses many techniques to find a page, including:

  • Following links from other sites or pages
  • Reading sitemaps

It isn’t clear why Google removed this specifically.  It is slightly redundant, but it was missing the submitting a URL option as well.

Improving Your Crawling

Google makes the use of hreflang a bit clearer, especially for those who might just be learning what hreflang is and how it works by providing a bit more detail.

Formerly it said “Use hreflang to point to alternate language pages.”  Now it states “Use hreflang to point to alternate versions of your page in other languages.”

Not a huge change, but a bit clearer.
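For anyone implementing this, here is a minimal sketch that prints the hreflang link tags for a page’s language versions; the URLs and language codes are placeholders, and each version of the page should carry the full set of tags, including one pointing to itself.

```python
# A minimal sketch: emit hreflang link tags for each language version of a page.
# The URLs and language codes are placeholders.
versions = {
    "en": "https://www.example.com/page",
    "es": "https://www.example.com/es/page",
    "de": "https://www.example.com/de/page",
}

for lang, href in versions.items():
    print(f'<link rel="alternate" hreflang="{lang}" href="{href}" />')
```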

Google has also added two new points, providing more detail about ensuring Googlebot is able to access all the content on the page, not just the content (words) specifically.

First, Google added:

Be sure that Google can access the key pages, and also the important resources (images, CSS files, scripts) needed to render the page properly.

So Google is stressing the importance of ensuring Google can access all the important content. And it is also specifically calling attention to other types of elements on the page that Google wants to have access to in order to properly crawl the page, including images, CSS and scripts. For those webmasters who went through the whole “mobile first indexing” launch, they are fairly familiar with issues surrounding blocked files, especially CSS and scripts, which some CMSes blocked Googlebot from crawling by default.

But for newer site owners, they might not realize this was possible, or that they might be doing it.  It would have been nice to see Google add specific information on how those newer to SEO can check for this, particularly for those who also might not be clear on what exactly “rendering” means.

Google also added:

Confirm that Google can access and render your page properly by running the URL Inspection tool on the live page.

Here Google does add specific information about using the URL Inspection tool in order to see what site owners are blocking or content that is causing issues when Google tries to render it.  I think these last two new points could have been combined, and made slightly clearer for how site owners can use the tool to check for all these issues.

Indexing

Google has made significant changes to this section as well. And Google starts off with making major changes to the first paragraph.  Here is the original version:

Googlebot processes each of the pages it crawls in order to compile a massive index of all the words it sees and their location on each page. In addition, we process information included in key content tags and attributes, such as title tags and alt attributes.

The updated version now reads:

Googlebot processes each page it crawls in order to understand the content of the page. This includes processing the textual content, key content tags and attributes, such as title tags and alt attributes, images, videos, and more.

Google no longer states it processes pages to “compile a massive index of all the words it sees and their location on each page.”  This was always a curious way for them to call attention to the fact they are simply indexing all words it comes across and their position on a page, when in reality it is a lot more complex than that.  So it definitely clears that up.

 

They have also added that they are processing “textual content” which is basically calling attention to the fact it indexes the words on the page, something that was assumed by everyone.  But it does differentiate between the new addition later in the paragraph regarding images, videos and more.

Previously, Google simply made reference to attributes such as title and alt tags and attributes.  But now it is getting more granular, specifically referring to “images, videos and more.”  However, this does mean Google is considering images, videos and “more” to understand the content on the page, which could affect rankings.

Improving your Indexing

Google changed “read our SEO guide for more tips” to “Read our basic SEO guide and advanced user guide for more tips.”

What is a document?

Google has added a massive section here called “What is a document?”  It talks specifically about how Google determines what is a document, but also includes details about how Google views multiple pages with identical content as a single document, even with different URLs, and how it determines canonicals.

First, here is the first part of this new section:

What is a “document”?

Internally, Google represents the web as an (enormous) set of documents. Each document represents one or more web pages. These pages are either identical or very similar, but are essentially the same content, reachable by different URLs. The different URLs in a document can lead to exactly the same page (for instance, example.com/dresses/summer/1234 and example.com?product=1234 might show the same page), or the same page with small variations intended for users on different devices (for example, example.com/mypage for desktop users and m.example.com/mypage for mobile users).

Google chooses one of the URLs in a document and defines it as the document’s canonical URL. The document’s canonical URL is the one that Google crawls and indexes most often; the other URLs are considered duplicates or alternates, and may occasionally be crawled, or served according to the user request: for instance, if a document’s canonical URL is the mobile URL, Google will still probably serve the desktop (alternate) URL for users searching on desktop.

Most reports in Search Console attribute data to the document’s canonical URL. Some tools (such as the Inspect URL tool) support testing alternate URLs, but inspecting the canonical URL should provide information about the alternate URLs as well.

You can tell Google which URL you prefer to be canonical, but Google may choose a different canonical for various reasons.

So the tl;dr is that Google will view pages with identical or near-identical content as the same document, regardless of how many of them there are. For seasoned SEOs, we know this as internal duplicate content.

Google also states that when Google determines these duplicates, they may not be crawled as often. This is important to note for site owners that are working to de-duplicate content which Google is considering duplicate. So it would be more important to submit these URLs to be recrawled, or give those newly de-duplicated pages links from the homepage, in order to ensure Google recrawls and indexes the new content and de-dupes them properly.

It also brings up an important note about desktop versus mobile: Google will still likely serve the desktop version of a page instead of the mobile version for desktop users, when a site has two different URLs for the same page, where one is designed for mobile users and the other for desktop. While many websites have changed to serving the same URL and content for both using responsive design, some sites still run two completely different sites and URLs for desktop and mobile users.

Google also mentions that you can tell Google the URL you prefer Google to use as the canonical, but states they can choose a different URL “for various reasons.” While Google doesn’t detail specifics about why Google might choose a different canonical than the one the site owner specifies, it is usually due to http vs https, whether a page is included in a sitemap or not, page quality, the pages appearing to be completely different and thus not suitable for canonicalization, or significant incoming links to the non-canonical URL.
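As a simple illustration of stating your preference, the sketch below prints the rel=canonical tag that each duplicate URL would carry; the URLs are placeholders modeled on the example in Google’s document.

```python
# A minimal sketch: every duplicate URL declares the one canonical address.
# The URLs are placeholders modeled on the example in Google's document.
canonical = "https://www.example.com/dresses/summer/1234"
duplicates = [
    "https://www.example.com/dresses/summer/1234?utm_source=newsletter",
    "https://www.example.com/index.php?product=1234",
]

tag = f'<link rel="canonical" href="{canonical}" />'
for dup in duplicates:
    print(f"{dup}\n  include in <head>: {tag}\n")
```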

Google has also included definitions for many of the terms used by SEOs and in Google Search Console.

Document: A collection of similar pages. Has a canonical URL, and possibly alternate URLs, if your site has duplicate pages. URLs in the document can be from the same or different organization (the root domain, for example “google” in www.google.com). Google chooses the best URL to show in Search results according to the platform (mobile/desktop), user language‡ or location, and many other variables. Google discovers related pages on your site by organic crawling, or by site-implemented features such as redirects or tags. Related pages on other organizations can only be marked as alternates if explicitly coded by your site (through redirects or link tags).

Again, Google is talking about the fact a single document can encompass more than just a single URL, as Google will consider a single document to potentially have many duplicate or near duplicate pages as well as pages assigned via canonical.  Google makes specific mention about “alternates” that appear on other sites, that can only be considered alternates if the site owner specifically codes it.  And that Google will choose the best URL from within the collection of documents to show.

But it fails to mention that Google can consider pages duplicate on other sites and will not show those duplicates, even if they aren’t from the same sites, something that site owners see happen frequently when someone steals content and sometimes sees the stolen version ranking over the original.

There was a notation added for the above, dealing with hreflang.

Pages with the same content in different languages are stored in different documents that reference each other using hreflang tags; this is why it’s important to use hreflang tags for translated content.

Google shows that it doesn’t include identical content under the same “document” when it is simply in a different language, which is interesting. But Google is stressing the importance of using hreflang in these cases.

URL: The URL used to reach a given piece of content on a site. The site might resolve different URLs to the same page.

Pretty self explanatory, although it does reference the fact that different URLs can resolve to the same page, presumably such as with redirects or aliases.

Page: A given web page, reached by one or more URLs. There can be different versions of a page, depending on the user’s platform (mobile, desktop, tablet, and so on).

Also pretty self explanatory, bringing up the specifics that some site owners can be served different versions of the same page, such as if they try and view the same page on a mobile device versus a desktop computer.

Version: One variation of the page, typically categorized as “mobile,” “desktop,” and “AMP” (although AMP can itself have mobile and desktop versions). Each version can have a different URL (example.com vs m.example.com) or the same URL (if your site uses dynamic serving or responsive web design, the same URL can show different versions of the same page) depending on your site configuration. Language variations are not considered different versions, but different documents.

Simply clarifying with greater details the different versions of a page, and how Google typically categorizes them as “mobile,” “desktop,” and “AMP”.

Canonical page or URL: The URL that Google considers as most representative of the document. Google always crawls this URL; duplicate URLs in the document are occasionally crawled as well.

Google states here again that non-canonical pages are not crawled as often as the main canonical that a site owner assigns to a group of pages. Google does not specifically mention here that they sometimes choose a different page as the canonical one, even if a specific page is designated as the canonical.

Alternate/duplicate page or URL: The document URL that Google might occasionally crawl. Google also serves these URLs if they are appropriate to the user and request (for example, an alternate URL for desktop users will be served for desktop requests rather than a canonical mobile URL).

The key takeaway here is that Google “might” occasionally crawl the site’s duplicate or alternative page.  And here they stress that Google will serve these alternative URLs “if they are appropriate.”  It is unfortunate they don’t go into greater detail in why they might serve these pages instead of the canonical, outside of the mention of desktop versus mobile, as we have seen many cases where Google picks a different page to show other than the canonical for a myriad of reasons.

Google also fails to mention how this impacts duplicate content found on other sites, though we do know Google will crawl those less often as well.

Site: Usually used as a synonym for a website (a conceptually related set of web pages), but sometimes used as a synonym for a Search Console property, although a property can actually be defined as only part of a site. A site can span subdomains (and even domains, for properly linked AMP pages).

Interesting to note here what they consider a website – a conceptually related set of webpages – and how it relates to the usage of a Google Search Console property, as “a property can actually be defined as only part of a site.”

Google does make mention that AMP pages, which technically can appear on a different domain, are considered part of the main site.

Serving Results

Google has made a pretty interesting specific change here in regards to their ranking factors.  Previously, Google stated:

Relevancy is determined by over 200 factors, and we always work on improving our algorithm.

Google has now updated this “over 200 factors” with a less specific one.

Relevancy is determined by hundreds of factors, and we always work on improving our algorithm.

The 200 factors in the How Google Search Works dates back to 2013 when the document was launched, although then it also made reference to PageRank (“Relevancy is determined by over 200 factors, one of which is the PageRank for a given page”) which Google removed when they redesigned their document in 2018.

While Google doesn’t go into specifics on the number anymore, it can be assumed that a significant number of ranking factors have been added since 2013 when this was first claimed in this document.  But I am sure some SEOs will be disappointed we don’t get a brand new shiny number like “over 500” ranking factors that SEOs can obsess about.

Final Thoughts

There are some pretty significant changes made to this document that SEOs can get a bit of insight from.

Google’s description of what it considers a document and how it relates to other identical or near-identical pages on a site is interesting, as well as Google’s crawling behavior towards the pages within a document it considers as alternate pages.  While this behavior has often been noted, it is more concrete information on how site owners should handle these duplicate and near-duplicate pages, particularly when they are trying to un-duplicate those pages and see them crawled and indexed as their own document.

They added a lot of useful advice for newer site owners, which is particularly helpful with so many new websites coming online this year due to the global pandemic.  Things such as checking a site without being logged in, how to submit both pages and sites to Google, etc.

The mention of what Google considers a “small site” is interesting because it gives a more concrete reference point for how Google sees large versus small sites.  For some, a small site could mean under 30 pages and the idea of a site with millions of pages being unfathomable.  And the reinforcement of a strong navigation, even for “small sites” is useful for showing site owners and clients who might push for navigation that is more aesthetic than practical for both usability and SEO.

The primary and secondary crawl additions will probably cause some confusion for those who think of primary and secondary in terms of how Google processes scripts on a page when it crawls it.  But it is nice to have more concrete information on how and when Google will crawl using the alternate version of Googlebot for sites that are usually crawled with either the mobile Googlebot or the desktop one.

Lastly, the change from the “200 ranking factors” to a less specific, but presumably much higher number of ranking factors will disappoint some SEOs who liked having some kind of specific number of potential ranking factors to work out.

[Source: This article was published in thesempost.com By JENNIFER SLEGG - Uploaded by the Association Member: Barbara larson]

Categorized in Search Techniques
