When using Guzzle for web crawling or scraping, handling cookies is a common challenge.
As we all know, an HTTP cookie (web cookie, browser cookie) is a small piece of data that a server sends to a user's web browser. The browser may store the cookie and send it back to the same server with later requests. This is typically used by a website to identify a user.
When crawling a website, we need to handle cookies properly, storing the ones the server sets and sending them back on later requests, so that we get the correct response.
Retrieving cookies from Guzzle is pretty straightforward: enable cookies on the client, then call the getConfig('cookies') method to get the cookie jar:
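// Enable a shared cookie jar for every request made by this client.
$client = new \GuzzleHttp\Client(['cookies' => true]);
$r = $client->request('GET', 'http://httpbin.org/cookies');
// The shared jar is stored in the client's configuration.
$cookieJar = $client->getConfig('cookies');
// Each cookie becomes an associative array (Name, Value, Domain, ...).
$cookies = $cookieJar->toArray();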
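To see the jar fill up, it helps to request a URL that sets a cookie. Here is a minimal sketch, assuming Guzzle is installed through Composer and that httpbin's /cookies/set endpoint is reachable. The cookie name flavor and its value oatmeal are made up for illustration. This version passes its own jar with the request instead of reading it back from the client configuration.

```php
<?php
// Assumes Guzzle was installed with Composer in the current directory.
require 'vendor/autoload.php';

use GuzzleHttp\Client;
use GuzzleHttp\Cookie\CookieJar;

// An empty jar that will collect whatever cookies the server sets.
$jar = new CookieJar();
$client = new Client();

// /cookies/set answers with a Set-Cookie header for each query parameter.
// "flavor=oatmeal" is an illustrative name and value, not anything special.
$client->request('GET', 'http://httpbin.org/cookies/set?flavor=oatmeal', [
    'cookies' => $jar,
]);

// The jar is iterable; each item is a GuzzleHttp\Cookie\SetCookie object.
foreach ($jar as $cookie) {
    echo $cookie->getName() . '=' . $cookie->getValue()
        . ' (domain: ' . $cookie->getDomain() . ")\n";
}

// Look up a single cookie by name; null is returned when it is not in the jar.
$flavor = $jar->getCookieByName('flavor');
if ($flavor !== null) {
    echo 'flavor cookie value: ' . $flavor->getValue() . "\n";
}
```

Looping over the jar gives you SetCookie objects with getters, which is often more convenient than the plain arrays returned by toArray().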
To send cookies with your request, pass a cookie jar in the cookies request option. Please note that the cookies option must be set to an instance of GuzzleHttp\Cookie\CookieJarInterface.
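Sending a cookie jar with a request can be sketched as follows. This assumes Guzzle is installed through Composer. The cookie name session_id and the value abc123 are hypothetical placeholders, so substitute whatever the target site expects.

```php
<?php
// Assumes Guzzle was installed with Composer in the current directory.
require 'vendor/autoload.php';

use GuzzleHttp\Client;
use GuzzleHttp\Cookie\CookieJar;

// Build a jar from name => value pairs, scoped to the domain being requested.
// "session_id" and "abc123" are hypothetical values used only for illustration.
$jar = CookieJar::fromArray(['session_id' => 'abc123'], 'httpbin.org');

$client = new Client();

// The jar is passed through the "cookies" request option.
$response = $client->request('GET', 'http://httpbin.org/cookies', [
    'cookies' => $jar,
]);

// httpbin echoes back the cookies it received as JSON.
echo $response->getBody();
```

CookieJar implements CookieJarInterface, so it satisfies the requirement above. The domain passed to fromArray() should match the host you are requesting, otherwise the cookie will not be attached to the request.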
Hope you find this tutorial useful. Sharing is caring, do share this tutorial if you have learned something from it.