• December 22, 2024

Google Search Scraper Nodejs

google-search-scraper – npm

Google search scraper with captcha solving support
This module lets you extract Google search results in a simple yet flexible way, and handles captcha solving transparently (through external services or your own hand-made solver).
Out of the box you can target a specific Google search host, specify a language, and limit the number of search results returned. These defaults can be extended with custom URL params through options.
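To make the option handling concrete, here is a minimal sketch of how a host, a language, an age filter, and extra URL params could be folded into one search URL. `buildSearchUrl`, its default host, and the `hl` and `tbs=qdr:` query parameters are assumptions made for illustration; this is not the module's actual implementation.

```javascript
// Hypothetical helper: folds scraper-style options into a search URL.
// The default host and the hl / tbs=qdr: parameter names are assumptions.
function buildSearchUrl(options) {
  const host = options.host || 'www.google.com';
  const url = new URL(`https://${host}/search`);

  url.searchParams.set('q', options.query);
  if (options.lang) url.searchParams.set('hl', options.lang);
  if (options.age) url.searchParams.set('tbs', `qdr:${options.age}`);

  // Custom params are copied last so they can override the defaults.
  for (const [key, value] of Object.entries(options.params || {})) {
    url.searchParams.set(key, String(value));
  }
  return url.toString();
}

const exampleUrl = buildSearchUrl({
  query: 'grenouille',
  lang: 'fr',
  age: 'd1',
  params: { custom: 'value' },
});
console.log(exampleUrl);
```

Copying the custom params last is a design choice in this sketch: it lets a caller override any default without the helper needing a special case for each one.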
A word of warning: This code is intended for educational and research use only. Use responsibly.
Installation
$ npm install google-search-scraper
Examples
Grab first 10 results for ‘nodejs’
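var scraper = require('google-search-scraper');

var options = {
  query: 'nodejs',
  limit: 10
};

scraper.search(options, function(err, url, meta) {
  // This is called for each result
  if(err) throw err;
  console.log(url);
  console.log(meta.title);
  console.log(meta.meta);
  console.log(meta.desc);
});
Various options combined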
var scraper = require('google-search-scraper');

var options = {
  query: 'grenouille',
  host: '',      // Google search host to target (left blank here)
  lang: 'fr',
  age: 'd1',
  limit: 10,
  params: {}     // custom URL params
};

scraper.search(options, function(err, url) {
  // This is called for each result
  if(err) throw err;
  console.log(url);
});
Extract all results on edu sites for “information theory” and solve captchas along the way
var scraper = require('google-search-scraper');
var DeathByCaptcha = require('deathbycaptcha');

var dbc = new DeathByCaptcha('username', 'password');

var options = {
  query: 'site:edu "information theory"',
  age: 'y',
  solver: dbc
};

scraper.search(options, function(err, url) {
  // This is called for each result
  if(err) throw err;
  console.log(url);
});
You can easily plug in your own solver by implementing a solve method with the following signature:
var customSolver = {
  solve: function(imageData, callback) {
    // err and solutionText stand for the outcome of your own solving logic
    var id = null;
    callback(err, id, solutionText);
  }
};
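As a rough illustration of that signature, the sketch below wraps any answering function into a solver object. `makeSolver` and `answerFn` are hypothetical names; only the `solve(imageData, callback)` shape and the `callback(err, id, solutionText)` argument order come from the text above.

```javascript
// Hypothetical factory: turns an "answer this captcha" function (sync or async)
// into an object exposing solve(imageData, callback).
function makeSolver(answerFn) {
  return {
    solve: function (imageData, callback) {
      // id is passed through as null, as in the signature above.
      const id = null;
      Promise.resolve()
        .then(() => answerFn(imageData))
        .then(
          (solutionText) => callback(null, id, solutionText),
          (err) => callback(err, id, undefined)
        );
    },
  };
}

// Usage: a stand-in solver that always "reads" the same text.
const fixedSolver = makeSolver(async () => 'abc123');
fixedSolver.solve(Buffer.from('fake image bytes'), (err, id, text) => {
  if (err) throw err;
  console.log(text);
});
```

An object built this way could be passed as the `solver` option in place of `dbc`, assuming the module only relies on the `solve` method described above.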
How to scrape Google Search organic results with Node.js?

Intro
I would like to tell you how to scrape Google Search organic results with Node.js.
Preparation
First, we need to create a project and add the npm packages “Axios” and “Cheerio”. To do this, create a file in the project directory, open the command line there, and enter:
npm init -y
then enter:
npm i axios cheerio
What will be scraped
Process
The following GIF shows the process of selecting Link, Title and Snippet CSS selectors using SelectorGadget Chrome extension.
Code
const cheerio = require("cheerio");
const axios = require("axios");

const searchString = "google";
const encodedString = encodeURI(searchString);

const AXIOS_OPTIONS = {
  headers: {
    "User-Agent":
      "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/74.0.3729.157 Safari/537.36",
  },
};

// Assumption: these container class names reflect Google's markup at one
// point in time and change often; re-pick them with SelectorGadget if
// nothing matches.
const RESULT_CONTAINER = ".yuRUbf";
const SNIPPET_SELECTOR = ".IsZvec";

function getOrganicResults() {
  return axios
    .get(
      `https://www.google.com/search?q=${encodedString}&hl=en&gl=us`,
      AXIOS_OPTIONS
    )
    .then(function ({ data }) {
      let $ = cheerio.load(data);

      const links = [];
      const titles = [];
      const snippets = [];

      // Collect links, titles and snippets in page order
      $(`${RESULT_CONTAINER} > a`).each((i, el) => {
        links[i] = $(el).attr("href");
      });
      $(`${RESULT_CONTAINER} > a > h3`).each((i, el) => {
        titles[i] = $(el).text();
      });
      $(SNIPPET_SELECTOR).each((i, el) => {
        snippets[i] = $(el).text().trim();
      });

      // Combine the three arrays into one object per result
      const result = [];
      for (let i = 0; i < links.length; i++) {
        result[i] = {
          link: links[i],
          title: titles[i],
          snippet: snippets[i],
        };
      }

      console.log(result);
    });
}

getOrganicResults();
Output
[
  {
    link: '',
    title: 'Google',
    snippet: "Search the world's information, including webpages, images, videos and more. Google has many special features to help you find exactly what you're looking..."
  },
  {
    link: '',
    title: 'The Keyword | Google',
    snippet: 'Discover all the latest about our products, technology, and Google culture on our official blog.'
  },
  {
    link: '',
    title: "Browse All of Google's Products & Services - Google",
    snippet: 'Browse a list of Google products designed to help you work and play, stay organized, get answers, keep in touch, grow your business, and more.'
  },
  {
    link: '',
    title: 'Google - About Google, Our Culture & Company News',
    snippet: 'Stay up to date with Google company news and products. Discover stories about our culture, philosophy, and how Google technology is impacting others.'
  },
  {
    link: '',
    title: 'Google - Home | Facebook',
    snippet: 'Google, Mountain View, CA. 28151297 likes · 25276... Google, profile picture. Google is on Facebook. To connect with Google, log in or create an account.'
  }
]
Using Google Search Organic Results API
SerpApi is a paid API with a free trial of 5,000 searches.
The difference is that you only need to iterate over ready-made, structured JSON instead of coding everything from scratch and hunting for the correct selectors, which can be time-consuming.
const SerpApi = require('google-search-results-nodejs');
const search = new SerpApi.GoogleSearch("YOUR_SECRET_KEY"); // To get the key, register on the SerpApi website

const params = {
  engine: "google",
  q: "google",
  location: "Austin, Texas, United States",
  google_domain: "",
  gl: "us",
  hl: "en"
};

const callback = function(data) {
  console.log(data.organic_results);
};

search.json(params, callback);
Output
organic_results: [
  {
    position: 1,
    title: "Google",
    link: "",
    displayed_link: "",
    snippet: "Search the world's information, including webpages, images, videos and more.",
    sitelinks: {
      expanded: [
        { title: "Account", snippet: "You're never more than a tap away from your data and settings. Just..." },
        { title: "Google Maps", snippet: "Get real-time navigation and more in the Maps app. Stay on web..." },
        { title: "Images", snippet: "Google Images. The most comprehensive image search..." },
        { title: "My Business", snippet: "Your free Business Profile on Google My Business helps you..." },
        { title: "Videos", snippet: "AllImages · Sign in. Videos. REPORT THIS. CANCEL. OK..." },
        { title: "Hangouts", snippet: "Use Google Hangouts to keep in touch with one person or a..." },
      ],
    },
  },
  {
    position: 2,
    title: "The Keyword | Google",
    snippet: "Discover all the latest about our products, technology, and Google culture on our official blog.",
    cached_page_link: "",
    related_pages_link: "",
  },
],
Links
Code in the online IDE • SerpApi Playground
Outro
If you want to see how to scrape something using Node.js that I didn't write about yet, or you want to see some project made with SerpApi, please write me a message.

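One detail of the Axios example above deserves a second look: `encodeURI` is meant for whole URLs, so it leaves characters such as `&`, `+` and `?` untouched, and a query containing them would change the meaning of the request URL. The sketch below compares the built-in encoders; the sample query is made up, and the search URL is only an assumed illustration.

```javascript
// Sample query with characters that are reserved in URLs (made up for illustration).
const rawQuery = 'c++ & node.js?';

// encodeURI is meant for whole URLs, so it leaves +, & and ? untouched.
const withEncodeURI = encodeURI(rawQuery);

// encodeURIComponent is meant for a single component such as a query value.
const withComponent = encodeURIComponent(rawQuery);

console.log(withEncodeURI); // c++%20&%20node.js?
console.log(withComponent); // c%2B%2B%20%26%20node.js%3F

// URLSearchParams form-encodes every value it serializes.
const query = new URLSearchParams({ q: rawQuery, hl: 'en', gl: 'us' });
const searchUrl = `https://www.google.com/search?${query}`;
console.log(searchUrl);
```

For a plain search string like "google" both encoders give the same result, so the difference only matters once user-supplied queries are involved.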
Scraping content from Google search results with request in …

For my app I need to get the first page of Google search results, but from a particular Google domain, because I need the “People also search for” knowledge graph info, which only shows up there.
I figured I can use the request and cheerio modules to scrape content from Google’s search results page, but when I try to access the URL I need, Google automatically redirects me to the country-specific domain (as I’m based in Germany).
I tried making it first load the URL that switches off the country-specific redirect in browsers, but it didn’t work…
Does anybody know what I could do differently to make it work?
Here’s my code… Thank you!
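var request = require("request");
var cheerio = require("cheerio");

// Serialize parsed cookies into a "key=value; key=value" Cookie header
function dataCookieToString(dataCookie) {
  var t = "";
  for (var x = 0; x < dataCookie.length; x++) {
    t += ((t != "") ? "; " : "") + dataCookie[x].key + "=" + dataCookie[x].value;
  }
  return t;
}

// Parse a Set-Cookie style string into an array of cookie objects
function mkdataCookie(cookie) {
  var t, j;
  // Split on commas that separate cookies, not on the ", " inside dates
  cookie = String(cookie).replace(/,([^ ])/g, ",[12],$1").split(",[12],");
  for (var x = 0; x < cookie.length; x++) {
    cookie[x] = cookie[x].split("; ");
    j = cookie[x][0].split("=");
    t = { key: j[0], value: j[1] };
    for (var i = 1; i < cookie[x].length; i++) {
      j = cookie[x][i].split("=");
      t[j[0]] = j[1];
    }
    cookie[x] = t;
  }
  return cookie;
}

var dataCookie = mkdataCookie('MC_STORE_ID=66860; expires=' + new Date(new Date().getTime() + 86409000));

request({
  uri: "", // the URL that switches off the country-specific redirect
  headers: {
    'User-Agent': 'Mozilla/5.0',
    "Cookie": dataCookieToString(dataCookie)
  }
}, function(error, response, body) {
  request({
    uri: "", // the search results URL that is actually needed
    headers: {
      'User-Agent': 'Mozilla/5.0'
    }
  }, function(error, response, body) {
    console.log(body);
    var $ = cheerio.load(body);
    // Selector for the knowledge graph block
    $("").each(function() {
      var link = $(this);
      var text = link.text();
      console.log(text);
    });
  });
});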
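To see what the two cookie helpers in the question are doing, here is a compact sketch of the same round trip: parse a comma-joined Set-Cookie style string into objects, then serialize the names and values back into a Cookie header. `parseCookies` and `toCookieHeader` are hypothetical names, not the asker's code, and the sketch assumes cookie values contain no `=` character.

```javascript
// Split a comma-joined Set-Cookie string into cookie objects.
// A comma followed by a space is assumed to belong to a date, not a separator.
function parseCookies(raw) {
  return String(raw)
    .split(/,(?! )/)
    .map((entry) => {
      const [first, ...attributes] = entry.split('; ');
      const [key, value] = first.split('=');
      const cookie = { key, value };
      // Remaining "name=value" parts (expires, path, ...) become properties.
      for (const attribute of attributes) {
        const [name, attributeValue] = attribute.split('=');
        cookie[name] = attributeValue;
      }
      return cookie;
    });
}

// Only the key/value pairs belong in the outgoing Cookie header.
function toCookieHeader(cookies) {
  return cookies.map((c) => `${c.key}=${c.value}`).join('; ');
}

const parsed = parseCookies('a=1; path=/,b=2; expires=Thu, 01 Jan 2030 00:00:00 GMT');
console.log(parsed);
console.log(toCookieHeader(parsed));
```

Attributes such as `expires` and `path` are kept on the parsed objects but dropped from the header, since a client only sends names and values back to the server.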
