Google Search Scraper Nodejs
google-search-scraper – npm
Google search scraper with captcha solving support
This module extracts Google search results in a simple yet flexible way and handles captcha solving transparently (through external services or your own hand-made solver).
Out of the box you can target a specific Google search host, specify a language and limit the number of search results returned. These defaults can be extended with custom URL params through options.
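To make the role of the host, language, limit and custom params more concrete, here is a minimal sketch of how such options could be combined into a search URL. It is not the module's actual implementation: the `buildSearchUrl` helper, the default host and the mapping of options onto the `q`, `hl` and `num` query parameters are assumptions made for illustration.

```javascript
// Hypothetical helper, not part of google-search-scraper.
function buildSearchUrl(options) {
  const host = options.host || "www.google.com"; // assumed default host
  const url = new URL(`https://${host}/search`);

  url.searchParams.set("q", options.query);
  if (options.lang) url.searchParams.set("hl", options.lang);
  if (options.limit) url.searchParams.set("num", String(options.limit));

  // Custom params are copied as-is and may override the values set above.
  for (const [key, value] of Object.entries(options.params || {})) {
    url.searchParams.set(key, String(value));
  }
  return url.toString();
}

const exampleUrl = buildSearchUrl({
  query: "information theory",
  lang: "en",
  limit: 10,
  params: { safe: "active" },
});
console.log(exampleUrl);
```

Using `URL` and `searchParams` keeps the escaping of the query in one place, so custom params cannot accidentally break the query string.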
A word of warning: This code is intended for educational and research use only. Use responsibly.
Installation
$ npm install google-search-scraper
Examples
Grab first 10 results for ‘nodejs’
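var scraper = require('google-search-scraper');

var options = {
  query: 'nodejs',
  limit: 10
};

scraper.search(options, function(err, url, meta) {
  // Called once for each result
  if (err) throw err;
  console.log(url);
  // meta carries the additional details reported for the result
  console.log(meta);
});
Various options combined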
var scraper = require('google-search-scraper');

var options = {
  query: 'grenouille',
  host: '',
  lang: 'fr',
  age: 'd1',
  limit: 10,
  params: {}
};

scraper.search(options, function(err, url) {
  if (err) throw err;
  console.log(url);
});
Extract all results on edu sites for "information theory" and solve captchas along the way
var scraper = require('google-search-scraper');
var DeathByCaptcha = require('deathbycaptcha');

var dbc = new DeathByCaptcha('username', 'password');

var options = {
  query: 'site:edu "information theory"',
  age: 'y',
  solver: dbc
};

scraper.search(options, function(err, url) {
  if (err) throw err;
  console.log(url);
});
You can easily plug in your own solver by implementing a solve method with the following signature:
var customSolver = {
  solve: function(imageData, callback) {
    // err and solutionText stand for whatever your solver produces
    var id = null;
    callback(err, id, solutionText);
  }
};
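As a minimal sketch of the solver contract above, the following wraps any promise-returning function into an object with a `solve(imageData, callback)` method. The `makeSolver` helper and the stand-in recogniser are hypothetical and only illustrate the `(err, id, solutionText)` callback shape; the type of `imageData` (a Buffer here) is also an assumption. A real solver would hand the image to a person or a service.

```javascript
// Hypothetical adapter: turns an async "image -> text" function into a solver
// object exposing solve(imageData, callback), the signature described above.
function makeSolver(recognise) {
  return {
    solve: function (imageData, callback) {
      Promise.resolve()
        .then(() => recognise(imageData))
        .then(
          // No solution id is tracked in this sketch, so null is passed for id.
          (solutionText) => callback(null, null, solutionText),
          (err) => callback(err)
        );
    },
  };
}

// Stand-in recogniser for demonstration only; it ignores the image contents.
const demoSolver = makeSolver(async (imageData) => `solved ${imageData.length} bytes`);

demoSolver.solve(Buffer.from([1, 2, 3]), (err, id, solutionText) => {
  if (err) throw err;
  console.log(id, solutionText);
});
```

The resulting object could then be passed as the `solver` option in place of `dbc`.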
How to scrape Google Search organic results with Node.js?
Intro
I would like to tell you how to scrape Google Search organic results with Node.js.
Preparation
First, we need to create a project and add the npm packages "Axios" and "Cheerio". To do this, create a file in the project directory, open the command line and enter:
npm init -y
then enter:
npm i axios cheerio
What will be scraped
Process
The following GIF shows the process of selecting Link, Title and Snippet CSS selectors using SelectorGadget Chrome extension.
Code
const cheerio = require("cheerio");
const axios = require("axios");

const searchString = "google";
const encodedString = encodeURI(searchString);

const AXIOS_OPTIONS = {
  headers: {
    "User-Agent":
      "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/74.0.3729.157 Safari/537.36",
  },
};

function getOrganicResults() {
  return axios
    .get(
      `${encodedString}&hl=en&gl=us`,
      AXIOS_OPTIONS
    )
    .then(function ({ data }) {
      // Load the returned HTML so it can be queried with CSS selectors
      let $ = cheerio.load(data);

      const links = [];
      const titles = [];
      const snippets = [];

      // Collect link, title and snippet of every organic result
      $(" > a").each((i, el) => {
        links[i] = $(el).attr("href");
      });
      $(" > a > h3").each((i, el) => {
        titles[i] = $(el).text();
      });
      $("").each((i, el) => {
        snippets[i] = $(el).text().trim();
      });

      // Merge the three parallel arrays into one array of result objects
      const result = [];
      for (let i = 0; i < links.length; i++) {
        result[i] = {
          link: links[i],
          title: titles[i],
          snippet: snippets[i],
        };
      }

      console.log(result);
    });
}

getOrganicResults();
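One detail of the code above deserves a short sketch: `encodeURI` leaves characters such as `&`, `=` and `?` untouched, so a search string containing them would change the meaning of the query string. Assuming the search string ends up in a single query parameter (the `q` parameter name below is an assumption, since the full URL is not shown above), `encodeURIComponent` or `URLSearchParams` escapes it safely:

```javascript
const searchString = "rock & roll";

// encodeURI keeps "&", so the text after it would be read as a new parameter.
const loose = encodeURI(searchString);
// encodeURIComponent escapes "&", keeping the whole string inside one value.
const strict = encodeURIComponent(searchString);

// URLSearchParams performs the escaping itself (spaces become "+").
const params = new URLSearchParams({ q: searchString, hl: "en", gl: "us" });

console.log(loose);             // rock%20&%20roll
console.log(strict);            // rock%20%26%20roll
console.log(params.toString()); // q=rock+%26+roll&hl=en&gl=us
```

For a plain word such as "google" all three approaches give the same result; the difference only shows up with reserved characters.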
Output
[
  {
    link: '',
    title: 'Google',
    snippet: "Search the world's information, including webpages, images, videos and more. Google has many special features to help you find exactly what you're looking..."
  },
  {
    link: '',
    title: 'The Keyword | Google',
    snippet: 'Discover all the latest about our products, technology, and Google culture on our official blog.'
  },
  {
    link: '',
    title: "Browse All of Google's Products & Services - Google",
    snippet: 'Browse a list of Google products designed to help you work and play, stay organized, get answers, keep in touch, grow your business, and more.'
  },
  {
    link: '',
    title: 'Google - About Google, Our Culture & Company News',
    snippet: 'Stay up to date with Google company news and products. Discover stories about our culture, philosophy, and how Google technology is impacting others.'
  },
  {
    link: '',
    title: 'Google - Home | Facebook',
    snippet: 'Google, Mountain View, CA. 28151297 likes · 25276... Google, profile picture. Google is on Facebook. To connect with Google, log in or create an account.'
  }
]
Using Google Search Organic Results API
SerpApi is a paid API with a free trial of 5,000 searches.
The difference is that you only need to iterate over ready-made, structured JSON instead of coding everything from scratch and hunting for the correct selectors, which can be time-consuming.
const SerpApi = require('google-search-results-nodejs');
const search = new SerpApi.GoogleSearch("YOUR_SECRET_KEY"); // To get the key, register on SerpApi

const params = {
  engine: "google",
  q: "google",
  location: "Austin, Texas, United States",
  google_domain: "",
  gl: "us",
  hl: "en"
};

// Print only the organic results from the structured response
const callback = function(data) {
  console.log(data.organic_results);
};

search.json(params, callback);
organic_results: [
  {
    position: 1,
    title: "Google",
    link: "",
    displayed_link: "",
    snippet:
      "Search the world's information, including webpages, images, videos and more.",
    sitelinks: {
      expanded: [
        {
          title: "Account",
          snippet:
            "You're never more than a tap away from your data and settings. Just...",
        },
        {
          title: "Google Maps",
          snippet:
            "Get real-time navigation and more in the Maps app. Stay on web...",
        },
        {
          title: "Images",
          snippet: "Google Images. The most comprehensive image search...",
        },
        {
          title: "My Business",
          snippet:
            "Your free Business Profile on Google My Business helps you...",
        },
        {
          title: "Videos",
          snippet: "AllImages · Sign in. Videos. REPORT THIS. CANCEL. OK...",
        },
        {
          title: "Hangouts",
          snippet:
            "Use Google Hangouts to keep in touch with one person or a...",
        },
      ],
    },
  },
  {
    position: 2,
    title: "The Keyword | Google",
    snippet:
      "Discover all the latest about our products, technology, and Google culture on our official blog.",
    cached_page_link: "",
    related_pages_link: "",
  },
],
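Because the response is already structured, post-processing is plain object handling. Here is a minimal sketch, assuming a response shaped like the sample above; the `summarizeOrganicResults` helper and the inline sample data are illustrative and not part of the SerpApi client.

```javascript
// Hypothetical helper: reduce each organic result to a few fields of interest.
function summarizeOrganicResults(data) {
  const results = Array.isArray(data.organic_results) ? data.organic_results : [];
  return results.map(({ position, title, snippet }) => ({
    position,
    title,
    snippet: snippet ? snippet.trim() : "", // tolerate results without a snippet
  }));
}

// Sample data trimmed from the output shown above.
const sampleResponse = {
  organic_results: [
    {
      position: 1,
      title: "Google",
      snippet: "Search the world's information, including webpages, images, videos and more. ",
    },
    { position: 2, title: "The Keyword | Google" },
  ],
};

console.log(summarizeOrganicResults(sampleResponse));
```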
Links
Code in the online IDE • SerpApi Playground
Outro
If you want to see how to scrape something using Node.js that I haven't written about yet, or you want to see a project made with SerpApi, please write me a message.
Scraping content from Google search results with request in …
For my app I need to get the first page of Google search results, but from one specific domain, because I need the "People also search for" knowledge graph info, which only shows up on that domain.
I figured I can use the request and cheerio modules to scrape content from Google's search results page, but when I try to access the URL I need, Google automatically redirects me to the country-specific domain (as I'm based in Germany).
I tried first loading the URL that automatically switches off the country-specific redirect in browsers, but it didn't work…
Does anybody know what I could do differently to make it work?
Here’s my code… Thank you!
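var request = require("request");
var cheerio = require("cheerio");

// Serialize parsed cookies into a Cookie header value
function dataCookieToString(dataCookie) {
  var t = "";
  for (var x = 0; x < dataCookie.length; x++) {
    t += ((t != "") ? "; " : "") + dataCookie[x].key + "=" + dataCookie[x].value;
  }
  return t;
}

// Parse a cookie string into an array of { key, value, ...attributes } objects
function mkdataCookie(cookie) {
  var t, j;
  // Split on commas that separate cookies (not followed by a space)
  cookie = cookie.toString().replace(/,([^ ])/g, ",[12],$1").split(",[12],");
  for (var x = 0; x < cookie.length; x++) {
    cookie[x] = cookie[x].split("; ");
    j = cookie[x][0].split("=");
    t = {
      key: j[0],
      value: j[1]
    };
    for (var i = 1; i < cookie[x].length; i++) {
      j = cookie[x][i].split("=");
      t[j[0]] = j[1];
    }
    cookie[x] = t;
  }
  return cookie;
}

var dataCookie = mkdataCookie('MC_STORE_ID=66860; expires=' + new Date(new Date().getTime() + 86409000));

// First request: load the URL that should switch off the country redirect
request({
  uri: "",
  headers: {
    'User-Agent': 'Mozilla/5.0',
    "Cookie": dataCookieToString(dataCookie)
  }
}, function(error, response, body) {
  // Second request: load the search results page
  request({
    uri: "",
    headers: {
      'User-Agent': 'Mozilla/5.0'
    }
  }, function(error, response, body) {
    console.log(body);
    var $ = cheerio.load(body);

    $("").each(function() {
      var link = $(this);
      var text = link.text();

      console.log(text);
    });
  });
});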
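For reference, the two cookie helpers in the code above can be expressed more compactly. The sketch below is an illustration only and does not address the redirect problem itself; the function names `parseSetCookie` and `toCookieHeader` are hypothetical, and the parser handles a single cookie string of the form `name=value; attr=value`.

```javascript
// Parse one cookie string such as "name=value; expires=..." into an object.
function parseSetCookie(cookieString) {
  const [pair, ...attributes] = cookieString.split(";").map((part) => part.trim());
  const separator = pair.indexOf("=");
  const cookie =
    separator === -1
      ? { key: pair, value: "" }
      : { key: pair.slice(0, separator), value: pair.slice(separator + 1) };

  for (const attribute of attributes) {
    if (!attribute) continue;
    const idx = attribute.indexOf("=");
    // Attributes without "=" (for example Secure) are stored as true.
    if (idx === -1) cookie[attribute] = true;
    else cookie[attribute.slice(0, idx)] = attribute.slice(idx + 1);
  }
  return cookie;
}

// Build the value of a Cookie request header from parsed cookies.
function toCookieHeader(cookies) {
  return cookies.map(({ key, value }) => `${key}=${value}`).join("; ");
}

const parsed = parseSetCookie("MC_STORE_ID=66860; expires=Thu, 01 Jan 2026 00:00:00 GMT");
console.log(toCookieHeader([parsed])); // MC_STORE_ID=66860
```

Only the name and value are sent back to the server; attributes such as `expires` are kept on the parsed object but left out of the header.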