Scrape Amazon products

For your next project, you may need to get the price of a product from Amazon, store it, or compare it with the price at another online store. In this tutorial, I will show you how to write PHP code that scrapes websites for retail details. In the past it was easy to scrape almost any website with PHP, Python, or other languages, but online stores have started to block scraping for a few reasons:

  1. Some websites do not believe in open data access.
  2. Scraping sends a website too many requests too quickly, which can crash it.
  3. Some websites profit from ads, and scraping cuts into that revenue.

To get around these blocks, it is easier to use a scraping tool that handles them for you and returns the data you need. I will be using Scraping-bot.io, which offers a free plan for testing and competitively priced paid plans. With Scraping-bot you can parse any online store on the web, as well as real estate and social media data. You can check my article about scraping and the tool I prefer here.

Let’s start explaining how to scrape your first product:

Step 1: Create a free account on Scraping-bot

Navigate to the Scraping-bot website and create a free account here; no credit card is required.

Step 2: Get your username and API key.

Once you register, you will be redirected to your dashboard, where you can find your username and the API key you will need in the code for authorization.


Step 3: Define authentication parameters

Use the username and API key from your dashboard to build the Basic authentication token that every request to the API must carry.
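The cURL request in Step 5 sends this token in a Basic Authorization header. Here is a minimal sketch; "yourUsername" and "yourApiKey" are placeholders you replace with your own credentials:

//Replace the placeholders with the username and API key from your dashboard
$userName = "yourUsername";
$apiKey = "yourApiKey";

//Encode them as the Basic auth token used in the Authorization header later
$auth = base64_encode($userName.":".$apiKey);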

Step 4: Define the parameters you need to scrape. You can see the list of parameters here.

$url = "http://url_of_item.toparse";
//Ex: https://www.amazon.com/gp/product/B07NJPKZQG/
    
//Define the params you need. 
//Full list here:
//https://www.scraping-bot.io/web-scraping-documentation/#retail-api
$postParams = array(
    "url" => $url,
    "options" => array(
        "useChrome" => false, //set to true to render JavaScript with headless Chrome
        "premiumProxy" => true, //set to true to use premium proxies that unblock Amazon, Google, Rakuten
        "proxyCountry" => null, //choose a proxy country (example: "proxyCountry" => "FR")
        "waitForNetworkRequests" => false //wait for most AJAX requests to finish before returning the HTML (only valid when useChrome is true);
        //this can slow down or fail your scraping if some requests never end, so only use it if a price is loaded asynchronously
    )
);

Step 5: Send the request using cURL

//Define the API Endpoint
$apiEndPoint = "http://api.scraping-bot.io/scrape/retail";
$json = json_encode($postParams);

//Initiate cURL to request data
$curl = curl_init();
curl_setopt_array($curl, array(
    CURLOPT_URL => $apiEndPoint,
    CURLOPT_RETURNTRANSFER => true,
    CURLOPT_HTTP_VERSION => CURL_HTTP_VERSION_1_1,
    CURLOPT_CUSTOMREQUEST => "POST",
    CURLOPT_POSTFIELDS => $json,
    CURLOPT_HTTPHEADER => array(
        "Authorization: Basic ".$auth,
        "Content-Type: application/json"
    ),
));
//Execute the API Call
$response = curl_exec($curl);
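
Before parsing the response, you can also check the HTTP status code the API returned; anything other than 200 usually points to an authorization or quota problem. A small optional check (run it before curl_close in the next step):

//Optional: inspect the HTTP status code returned by the API
$httpStatus = curl_getinfo($curl, CURLINFO_HTTP_CODE);
if ($httpStatus !== 200) {
    echo "API call returned HTTP status " . $httpStatus;
}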

Step 6: Catch errors, if any, then parse and store the details.

$err = curl_error($curl);
curl_close($curl);
//Catch error and handle it
if ($err) {
    echo "cURL Error #:" . $err;
} else {
    // Receive the response and parse it
    $decoded = json_decode($response, true);
    $product = $decoded["data"];
    $title = $product['title'];
    $image = $product['image'];
    $price = $product['price'];
    $currency = $product['currency'];
    $color = $product['color'];
    $siteURL = $product['siteURL'];
    $details = array("title"=>$title, "image"=>$image, "price"=>$price, "currency"=>$currency, "color"=>$color, "siteURL"=>$siteURL);
    echo json_encode($details);
}
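
The snippet above only echoes the parsed fields. If you also want to persist them, as the step title suggests, here is a minimal sketch using PDO; the DSN, credentials, and the "products" table with its columns are assumptions you would adapt to your own schema:

//Sketch: store the scraped fields with PDO.
//The DSN, credentials, and the "products" table/columns are placeholders.
$pdo = new PDO("mysql:host=localhost;dbname=scraping", "dbUser", "dbPassword");
$statement = $pdo->prepare(
    "INSERT INTO products (title, image, price, currency, color, site_url)
     VALUES (:title, :image, :price, :currency, :color, :site_url)"
);
$statement->execute(array(
    ":title" => $title,
    ":image" => $image,
    ":price" => $price,
    ":currency" => $currency,
    ":color" => $color,
    ":site_url" => $siteURL
));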

You can either hard-code the product URL in the script, as above, or place the code on a server and POST the URL to it; in that case, the script becomes a small API that sends back product details.
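
For the second option, here is a minimal sketch of reading the product URL from an incoming JSON POST body; the "url" field name is an assumption about how you choose to call your own endpoint:

//Sketch: read the product URL from a JSON POST body instead of hard-coding it.
//Assumes the caller sends {"url": "https://www.amazon.com/gp/product/..."}.
$requestBody = json_decode(file_get_contents("php://input"), true);
if (isset($requestBody["url"])) {
    $url = $requestBody["url"];
} else {
    http_response_code(400);
    echo "Missing 'url' in request body";
    exit;
}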

Please leave your feedback below, and feel free to ask anything. Good luck with your next project.