[php] How can I send cookies using PHP curl in addition to CURLOPT_COOKIEFILE?

I am scraping some content from a website after a form submission. The problem is that the script is failing every now and then, say 2 times out of 5 the script fails. I am using php curl, COOKIEFILE and COOKIEJAR to handle the cookie. However when I observed the sent headers of my browser (when visiting the target website from my browser and using live http headers) and the headers sent by php and saw there are many differences.

My browser sent a lot more cookie variables than php curl. I think this difference might be because javascript is resposible for setting most of the cookies, however I'm not sure about this.

I am using the below code to do the scraping and I am showing the sent headers of my browser and of php curl:

$ckfile = tempnam ("/tmp", 'cookiename');

$url = 'https://www.domain.com/firststep';
$poststring = 'variable1=4&variable2=5';
$ch = curl_init ($url);
curl_setopt ($ch, CURLOPT_COOKIEJAR, $ckfile);
curl_setopt ($ch, CURLOPT_COOKIEFILE, $ckfile);
curl_setopt($ch, CURLOPT_CONNECTTIMEOUT, 30);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, false);
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, 1);
curl_setopt ($ch, CURLOPT_POST, 1);
curl_setopt ($ch, CURLOPT_POSTFIELDS, $poststring);
$output = curl_exec ($ch);
curl_close($ch);



$url = 'https://www.domain.com/nextstep';
$poststring = 'variableB1=4&variableB2=5';
$ch = curl_init ($url);
curl_setopt ($ch, CURLOPT_COOKIEJAR, $ckfile);
curl_setopt ($ch, CURLOPT_COOKIEFILE, $ckfile);
curl_setopt($ch, CURLOPT_CONNECTTIMEOUT, 30);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, false);
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, 1);
curl_setopt ($ch, CURLOPT_POST, 1);
curl_setopt ($ch, CURLOPT_POSTFIELDS, $poststring);
curl_setopt($ch, CURLINFO_HEADER_OUT, true);
$output = curl_exec ($ch);
$headers = curl_getinfo($ch, CURLINFO_HEADER_OUT);
curl_close($ch);

print_r($headers);

// Gives:
POST /d-cobs-web/doffers.html;jsessionid=7BC2A5277A4EB07D9A7237A707BE1366 HTTP/1.1
User-Agent: Mozilla
Host: domain.subdomain.nl
Accept: */*
Cookie: JSESSIONID=7BC2A5277A4EB07D9A7237A707BE1366; www-20480=MIFBNLFDFAAA
Content-Length: 187
Content-Type: application/x-www-form-urlencoded

// Where live http headers gives:
POST /d-cobs-web/doffers.html;jsessionid=7BC2A5277A4EB07D9A7237A707BE1366 HTTP/1.1
Host: domain.subdomain.nl
User-Agent: Mozilla/5.0 (Windows NT 6.1; WOW64; rv:21.0) Gecko/20100101 Firefox/21.0
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8
Accept-Language: nl,en-us;q=0.7,en;q=0.3
Accept-Encoding: gzip, deflate
Content-Type: application/x-www-form-urlencoded; charset=UTF-8
Referer: https://domain.subdomain.nl/dd/doffers.html?returnUrl=https%3A%2F%2Fttcc.subdomain.nl%2Fdd%2Fpreferences.html%3FValueChanged%3Dfalse&BEGBA=&departureDate=13-06-2013&extChangeTime=&pax2=0&bp=&pax1=1&pax4=0&bk=&pax3=0&shopId=&xtpage=&partner=NSINT&bc=&xt_pc=&ov=&departureTime=&comfortClass=2&destination=DEBHF&thalysTicketless=&beneUser=&debugDOffer=&logonId=&valueChanged=&iDomesticOrigin=&rp=&returnTime=&locale=nl_NL&vu=&thePassWeekend=false&returnDate=&xtsite=&pax=A&lc2=&lc1=&lc4=&lc3=&lc6=&lc5=&BECRA=&passType2=&custId=&lc9=&iDomesticDestination=&passType1=A&lc7=&lc8=&origin=NLASC&toporef=&pid=&passType4=&returnTimeType=1&passType3=&departureTimeType=1&socusId=&idr3=&xtn2=&loyaltyCard=&idr2=&idr1=&thePassBusiness=false&cid=14812
Content-Length: 219
Cookie: subdomainPARTNER=NSINT; JSESSIONID=CB3FEB3AC72AD61A80BFED91D3FD96CA; www-20480=MHFBNLFDFAAA; campaignPos=5; www-47873=MGFBNLFDFAAA; __utma=1.993399624.1370027094.1370040145.1370082133.5; __utmc=1; __utmz=1.1370027094.1.1.utmcsr=(direct)|utmccn=(direct)|utmcmd=(none); BCSessionID=5dc05787-c2c8-43e1-9abe-93989970b087; BCPermissionLevel=PERSONAL; __utmb=1.1.10.1370082133
Connection: keep-alive
Pragma: no-cache
Cache-Control: no-cache
AJAXREQUEST=_viewRoot&doffersForm=doffersForm&doffersForm%3AvalueChanged=&doffersForm%3ArequestValid=true&javax.faces.ViewState=j_id3&doffersForm%3Aj_id937=doffersForm%3Aj_id937&valueChanged=false&AJAX%3AEVENTS_COUNT=1&

I would like to use:

$headers   = array();
$headers[] = 'Cookie: ' . $cookie;

and:

curl_setopt($ch, CURLOPT_HTTPHEADER, $headers);

where:

$cookie = 'subdomainPARTNER=NSINT; JSESSIONID=CB3FEB3AC72AD61A80BFED91D3FD96CA; www-20480=MHFBNLFDFAAA; campaignPos=5; www-47873=MGFBNLFDFAAA; __utma=1.993399624.1370027094.1370040145.1370082133.5; __utmc=1; __utmz=1.1370027094.1.1.utmcsr=(direct)|utmccn=(direct)|utmcmd=(none); BCSessionID=5dc05787-c2c8-43e1-9abe-93989970b087; BCPermissionLevel=PERSONAL; __utmb=1.1.10.1370082133';

Some of the parameters in the cookie above I might be able to scrape from the content of the website, but not all. Some of them I might be able to read from the $ckfile, but I don't know how to do that. Especially the utma utmc, utmz, utmcsr, utmccn, utmcmd I am not able to get from anywhere, I think these are generated by the javascript.

Question 1: Am I doing something wrong with the cookie handling in the current code as very few cookie variables are sent by php curl and a lot more by the browser? Further: can other differences between sent headers by browser and php curl be a problem to return the right content?

Question 2: Are the missing cookie variables due to the javascript setting those cookies?

Question 3: What is the best way to handle the cookies to make sure that all required cookies are being sent to the remote server?

Your help is very welcome!

This question is related to php cookies curl setcookie

The answer is


If the cookie is generated from script, then you can send the cookie manually along with the cookie from the file(using cookie-file option). For example:

# sending manually set cookie
curl_setopt($ch, CURLOPT_HTTPHEADER, array("Cookie: test=cookie"));

# sending cookies from file
curl_setopt($ch, CURLOPT_COOKIEFILE, $ckfile);

In this case curl will send your defined cookie along with the cookies from the file.

If the cookie is generated through javascrript, then you have to trace it out how its generated and then you can send it using the above method(through http-header).

The utma utmc, utmz are seen when cookies are sent from Mozilla. You shouldn't bet worry about these things anymore.

Finally, the way you are doing is alright. Just make sure you are using absolute path for the file names(i.e. /var/dir/cookie.txt) instead of relative one.

Always enable the verbose mode when working with curl. It will help you a lot on tracing the requests. Also it will save lot of your times.

curl_setopt($ch, CURLOPT_VERBOSE, true);

I think the only cookie you need is JSESSIONID=xxx..

Also NEVER share your cookies, becasuse someone may access your personal data that way. Specially when the cookies are session. These cookies will stop working once you logout the site.


Here is a list of examples for sending cookies - https://github.com/andriichuk/php-curl-cookbook#cookies

$curlHandler = curl_init();

curl_setopt_array($curlHandler, [
CURLOPT_URL => 'https://httpbin.org/cookies',
CURLOPT_RETURNTRANSFER => true,

CURLOPT_COOKIEFILE  => $cookieFile,
CURLOPT_COOKIE => 'foo=bar;baz=foo',

/**
 * Or set header
 * CURLOPT_HTTPHEADER => [
       'Cookie: foo=bar;baz=foo',
   ]
 */
]);

$response = curl_exec($curlHandler);
curl_close($curlHandler);

echo $response;

Try below code,

$cookieFile = "cookies.txt";
if(!file_exists($cookieFile)) {
    $fh = fopen($cookieFile, "w");
    fwrite($fh, "");
    fclose($fh);
}


$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, $apiCall);
curl_setopt($ch, CURLOPT_POST, TRUE);
curl_setopt($ch, CURLOPT_POSTFIELDS, $jsonDataEncoded);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, TRUE);
curl_setopt($ch, CURLOPT_SSL_VERIFYHOST, FALSE);
curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, FALSE);
curl_setopt($ch, CURLOPT_HTTPHEADER, array('Content-Type: application/json'));
curl_setopt($ch, CURLOPT_COOKIEFILE, $cookieFile); // Cookie aware
curl_setopt($ch, CURLOPT_COOKIEJAR, $cookieFile); // Cookie aware
curl_setopt($ch, CURLOPT_VERBOSE, true);
if(!curl_exec($ch)){
    die('Error: "' . curl_error($ch) . '" - Code: ' . curl_errno($ch));
}
else{
    $response = curl_exec($ch); 
}
curl_close($ch);
$result = json_decode($response, true);

echo '<pre>';
var_dump($result);
echo'</pre>';

I hope this will help you.

Best regards, Dasitha.


Examples related to php

I am receiving warning in Facebook Application using PHP SDK Pass PDO prepared statement to variables Parse error: syntax error, unexpected [ Preg_match backtrack error Removing "http://" from a string How do I hide the PHP explode delimiter from submitted form results? Problems with installation of Google App Engine SDK for php in OS X Laravel 4 with Sentry 2 add user to a group on Registration php & mysql query not echoing in html with tags? How do I show a message in the foreach loop?

Examples related to cookies

SameSite warning Chrome 77 How to fix "set SameSite cookie to none" warning? Set cookies for cross origin requests Make Axios send cookies in its requests automatically How can I set a cookie in react? Fetch API with Cookie How to use cookies in Python Requests How to set cookies in laravel 5 independently inside controller Where does Chrome store cookies? Sending cookies with postman

Examples related to curl

What is the incentive for curl to release the library for free? curl: (35) error:1408F10B:SSL routines:ssl3_get_record:wrong version number Converting a POSTMAN request to Curl git clone error: RPC failed; curl 56 OpenSSL SSL_read: SSL_ERROR_SYSCALL, errno 10054 How to post raw body data with curl? Curl : connection refused How to use the curl command in PowerShell? Curl to return http status code along with the response How to install php-curl in Ubuntu 16.04 curl: (35) SSL connect error

Examples related to setcookie

How can I send cookies using PHP curl in addition to CURLOPT_COOKIEFILE? Check if a PHP cookie exists and if not set its value Cookies on localhost with explicit domain