Simultaneous HTTP requests in PHP with cURL

February 19th, 2008. Tagged: mashup, performance, php, yahoo, ydn

The basic idea of a Web 2.0-style "mashup" is that you consume data from several services, often from different providers, and combine it in interesting ways. This means you often need to make more than one HTTP request to a service or services. In PHP, if you use something like file_get_contents(), all the requests will be synchronous: a new one is fired only after the previous has completed. If you need to make three HTTP requests and each call takes a second, your app is delayed at least three seconds.
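To make the baseline concrete, here's a minimal sketch of the sequential approach (the helper name is made up for illustration): each file_get_contents() call blocks until its response arrives, so the total time is the sum of the individual request times.

```php
<?php
// Sequential fetching: each call blocks until the response arrives,
// so three 1-second requests cost roughly three seconds in total.
function sequentialRequests($urls) {
  $responses = array();
  foreach ($urls as $id => $url) {
    // returns the response body as a string, or false on failure
    $responses[$id] = file_get_contents($url);
  }
  return $responses;
}
```

Calling this with three service URLs fires the requests strictly one after another; avoiding exactly this serialization is the point of what follows.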

Solution

An improvement of course is to cache responses as much as possible, but at one point or another you still need to make those requests.

Using the curl_multi* family of cURL functions, you can make those requests simultaneously. This way your app is only as slow as the slowest request, as opposed to the sum of all requests. And that's something.

A function

Here's a little function I coded that will allow you to do multiple requests.

<?php
 
function multiRequest($data, $options = array()) {
 
  // array of curl handles
  $curly = array();
  // data to be returned
  $result = array();
 
  // multi handle
  $mh = curl_multi_init();
 
  // loop through $data and create curl handles
  // then add them to the multi-handle
  foreach ($data as $id => $d) {
 
    $curly[$id] = curl_init();
 
    $url = (is_array($d) && !empty($d['url'])) ? $d['url'] : $d;
    curl_setopt($curly[$id], CURLOPT_URL,            $url);
    curl_setopt($curly[$id], CURLOPT_HEADER,         0);
    curl_setopt($curly[$id], CURLOPT_RETURNTRANSFER, 1);
 
    // post?
    if (is_array($d)) {
      if (!empty($d['post'])) {
        curl_setopt($curly[$id], CURLOPT_POST,       1);
        curl_setopt($curly[$id], CURLOPT_POSTFIELDS, $d['post']);
      }
    }
 
    // extra options?
    if (!empty($options)) {
      curl_setopt_array($curly[$id], $options);
    }
 
    curl_multi_add_handle($mh, $curly[$id]);
  }
 
  // execute the handles
  // curl_multi_select() waits for activity on any of the handles,
  // so the loop doesn't spin at 100% CPU while requests are in flight
  $running = null;
  do {
    curl_multi_exec($mh, $running);
    if ($running > 0 && curl_multi_select($mh) === -1) {
      usleep(100000); // select failed, back off briefly
    }
  } while ($running > 0);
 
 
  // get content and remove handles
  foreach($curly as $id => $c) {
    $result[$id] = curl_multi_getcontent($c);
    curl_multi_remove_handle($mh, $c);
  }
 
  // all done
  curl_multi_close($mh);
 
  return $result;
}
 
?>

Consuming

The function accepts an array of URLs to hit and, optionally, an array of cURL options if you need to pass any. The first argument can be a simple indexed array of URLs, or it can be an array of arrays where each inner array has a key named "url". If you use the second form and also include a key called "post", the function will make a POST request.

The function returns an array of responses as strings. The keys in the result array match the keys in the input.

A GET example

Let's say you want to use some Yahoo search web services (consult YDN) to create a music artist band-o-pedia kind of mashup. Here's how you can search audio, video and images at the same time:

<?php
 
$data = array(
  'http://search.yahooapis.com/VideoSearchService/V1/videoSearch?appid=YahooDemo&query=Pearl+Jam&output=json',
  'http://search.yahooapis.com/ImageSearchService/V1/imageSearch?appid=YahooDemo&query=Pearl+Jam&output=json',
  'http://search.yahooapis.com/AudioSearchService/V1/artistSearch?appid=YahooDemo&artist=Pearl+Jam&output=json'
);
$r = multiRequest($data);
 
echo '<pre>';
print_r($r);
 
?>

This will print something like:

Array
(
    [0] => {"ResultSet":{"totalResultsAvailable":"633","totalResultsReturned":...
    [1] => {"ResultSet":{"totalResultsAvailable":"105342","totalResultsReturned":...
    [2] => {"ResultSet":{"totalResultsAvailable":10,"totalResultsReturned":...
)

A POST example

There's an interesting Yahoo search service called term extraction which analyses content. It accepts POST requests. Here's how to consume this service with the function above, making two simultaneous requests:

<?php
$data = array(array(),array());
 
$data[0]['url']  = 'http://search.yahooapis.com/ContentAnalysisService/V1/termExtraction';
$data[0]['post'] = array();
$data[0]['post']['appid']   = 'YahooDemo';
$data[0]['post']['output']  = 'php';
$data[0]['post']['context'] = 'Now I lay me down to sleep,
                               I pray the Lord my soul to keep;
                               And if I die before I wake,
                               I pray the Lord my soul to take.';
 
 
$data[1]['url']  = 'http://search.yahooapis.com/ContentAnalysisService/V1/termExtraction';
$data[1]['post'] = array();
$data[1]['post']['appid']   = 'YahooDemo';
$data[1]['post']['output']  = 'php';
$data[1]['post']['context'] = 'Now I lay me down to sleep,
                               I pray the funk will make me freak;
                               If I should die before I waked,
                               Allow me Lord to rock out naked.';
 
$r = multiRequest($data);
 
print_r($r);
?>

And the result:

Array
(
    [0] => a:1:{s:9:"ResultSet";a:1:{s:6:"Result";s:5:"sleep";}}
    [1] => a:1:{s:9:"ResultSet";a:1:{s:6:"Result";a:3:{i:0;s:5:"freak";i:1;s:5:"sleep";i:2;s:4:"funk";}}}
)
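Since the requests above asked the service for output=php, each response body is a serialized PHP value. Here's a quick sketch of turning one back into a structure, using the first result above as input:

```php
<?php
// unserialize() restores the serialized response body into a PHP array
$raw = 'a:1:{s:9:"ResultSet";a:1:{s:6:"Result";s:5:"sleep";}}';
$parsed = unserialize($raw);
echo $parsed['ResultSet']['Result']; // prints "sleep"
```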

73 Responses

  2. I just use Yahoo! Pipes to pull all my feeds in parallel and give me a single json feed of the processed data. Remind me to write a blog post about it sometime.

    Also, ask me tomorrow about another way to do federated queries.

  6. It would make a great trio with SimpleXML and the Google Geo API.

  7. AFAIK it’s possible to make async requests without libcurl: http://php.net/stream_socket_client

  8. This article is like gold dust to me. I’m working on an app that is severely limited by cURL’s synchronous requests. I’ve had to implement forking to improve speed/scalability. I will be trying this method soon.

    Any idea if there’s something similar for SOAP requests? If I can crack that I’ll be laughing.

  9. @author

    Thanks for bringing attention to this very powerful set of functions; it’s news to me!

    @James

    SOAP is merely an HTTP request, so of course you can do it with curl_multi. Now, can you use PHP 5’s built-in SOAP extension? I doubt it. But just use an existing SOAP lib (there are several good ones) or create your own; it really isn’t that hard. I had to make one for PHP 4 a while back and it was fun.

  11. Nice article. I’m going to translate it into Russian, if you don’t mind. I think you have a misprint: Simultaneuos -> Simultaneous.

    And off-topic: I’ve started a blog game, and I want to pass the baton on to you. The idea is that you write about five things/tools you can’t work properly without. For example, it may be a favourite pen, Firefox, or MacOS. Then you pass the baton on to 5 blogs from your blogroll.

  13. Hey Stoyan! Very nice writeup. I didn’t know about these capabilities of curl! Going to try this soon. Thanks :)

  15. This is EXACTLY what I needed. Thanks! I’ve been pulling my hair out trying to increase the performance of my app. Is there any way to tie an ID to each request, so that I can track the requests by an arbitrary ID as opposed to just the array key?

  16. At last someone did it! We used python for these tasks, but now I can use my lovely PHP for this! Yes!!! Thanks!

  17. Thanks very much for this – I was having trouble getting my head around this, much clearer now. The project I’m sketching up now will definitely benefit from making requests simultaneously…

    Cheers, Andi

  19. Works perfectly thanks!

  20. [...] I recently completed a project that required moving small amounts of data to different servers. Easy enough. The only complication was that it needed to make around 500 separate requests a minute (and growing), to different servers, with different data. Unfortunately cURL typically makes its requests in a synchronous fashion: the next event doesn’t fire until the previous one has completed. When you’re running 500 a minute, every minute, and you have to wait for the previous event to finish, you immediately start building a stockpile of requests, which is bad. Very bad. Every minute the queue grows bigger, and the performance becomes worse. However, cURL has a built-in group of functions called curl_multi which allows you to send all of your requests simultaneously. It reduced the processing time so dramatically that my software can now easily send the 500 requests in under 10 seconds. To that end, I ended up rewriting a function I found over at phpied.com to handle a variety of different scenarios involving POST and GET parameters. Most of the documentation you need can be found in the example.php file where I included a bunch of different case scenarios. [...]

  21. Great tip, I am going to use it on my spiders on wordsfinder.com … I have got a few intelligence tools that might like your findings.

  24. Great job,

    I was having a cURL timeout issue while making HTTP requests to more than 10-15 different services, and it was taking too long to get the responses from all of them. Initially I thought to create child processes and treat them as threads (but PHP has no support for threading in its core, so we have to rely on system functions to create processes), but this article solved my problem, as I didn’t know about cURL’s support for simultaneous HTTP requests. Now I can use this approach to solve the timeout and child-process issues.

    Thanks and once again nice Job.

  25. Just a note on the asynchronous nature of cURL’s multi functions: the DNS lookups are not (as far as I know today) asynchronous. So if one DNS lookup of your group fails, everything in the list of URLs after that fails also. We actually update our hosts.conf (I think?) file on our server daily in order to get around this. It gets the IP addresses there instead of looking them up. I believe it’s being worked on, but not sure if it’s changed in cURL yet.

  26. Thank you for this example of multi cURL…

  27. Could you place your code into a downloadable zipped file? I like the way your website works with these comments. Nice job.

  28. Is it possible to show the results one by one on the website? Right now Firefox just loads the page, and I have to wait till all pages are collected to get the results. I want to show a ‘please wait…’ message, then insert the results one by one as they are collected, and show a ‘thank you!’ label when finished. Is it possible to do such a thing with these scripts, or do I have to use AJAX or AHAH? Thanks, Mark

  30. Thanks for your tip. I would like to translate to Vietnamese if you don’t mind. I made a pingback.

  31. I usually don’t comment but this one deserves a thumb up. You’ve done a great job explaining this curl_multi* family as you name it.

  33. works perfect thanks…

  34. Thanks for sharing this example. I made some modifications so that you can process each request as soon as it completes. It makes things a lot faster when you’re dealing with a large number of requests:

    http://onlineaspect.com/2009/01/26/how-to-use-curl_multi-without-blocking/

  35. // execute the handles
    $running = null;
    do {
    curl_multi_exec($mh, $running);
    } while($running > 0);

    Don’t use it without a small delay!
    A better way is to insert usleep(25000); inside the loop.

  36. I had a problem before I understood that such web requests to APIs must be simultaneous. It sometimes takes up to 3 minutes for my app to get and parse data from a remote server. Hopefully this function will help to significantly decrease page load time. Thank you for the solution!

  39. I don’t understand the loop. Why would you want to loop curl_multi_exec()? It only needs to be run once… What am I missing? Thanks.

    // execute the handles
    $running = null;
    do {
    curl_multi_exec($mh, $running);
    usleep(25000);
    } while($running > 0);

  40. Explanation of this code:
    // execute the handles
    $running = null;
    do {
    curl_multi_exec($mh, $running);
    usleep(25000);
    } while($running > 0);

    1. curl_multi_exec() needs to be called multiple times. Every time it is called, the handles progress a little bit. There is also a return value which indicates whether something happened, such as a request finishing.

    2. The usleep() is important! Without it, this tight do-while loop would spin very quickly and very often, because the requests may take several seconds to complete. That would produce high CPU load on the server, which is of course not what we want.

  41. After unsuccessful attempts to use the stream_ functions in PHP, you’ve given me the solution I needed. Thank you!

  42. Thanks for this post it has been a big help to me in speeding up one of my scripts that runs for a really long time.

  43. I have the same problem as Mike above, in that I don’t understand why we need to loop the curl_multi_exec() function.

    In my code, I time the execution of the loop, and it takes roughly n times the execution of one request. If I have 1 query to do, it takes x seconds, 2 queries 2x seconds, n requests nx seconds. This implies sequential execution.

    I am very confused … :(

    Also, what does it mean that “every time it is called the handles progress a little bit”?

    mycode:
    $timeStart = microtime(true);
    $running = null;
    do {
    curl_multi_exec($mh, $running);
    usleep(25000);
    } while($running > 0);
    $timeEnd = microtime(true);
    echo $timeEnd-$timeStart;

  44. As far as I know,
    you need to loop the multi cURL to find which handle has finished processing and then close it.
    I don’t understand why he needs to loop curl_multi_exec() either.

  46. This is an awesome tutorial and method. It’s working perfectly and solved my problem. Thank you very much! :)

  49. +1 for usleep(25000);

  50. Perfect…. I cut, I paste, It works. Thanks SOOO much for the post. My boss loves me (and you).

  51. Very nice example of cURL multi-URL requests in a nice, easy-to-read function. Thanks be many for this one!

  53. wow, thanks for this. This blog post saved my butt today. Got it to around 1 sec or more, I’m seeing if there are other optimizations to be had. good stuff keep up the good work!

  54. Awesome! This is just what I was looking for… This article should be linked to from php.net’s explanation of curl_multi_exec

  62. Just want to say thanks! Helped me very much! Both examples were very helpful!

  64. Works great, thanks for sharing!

  66. Hi,

    When I use this code to send multiple XML documents to multiple URLs, with the data set as an array,
    cURL posts the XML correctly but adds a Content-Disposition header. That means cURL sends the XML as an attached file, not as text/xml, even if I set the correct Content-Type header. I think the problem is because the XML documents are saved in the $data[] array.

    Is there any way to send the XML without the Content-Disposition header?

    Thanks in advance

    Liacko

  69. This is great although I cannot get it to post. Would you be willing to look at my code and tell me where I am missing a step?

  70. This is great. Curling 8 URLs was taking around 8 secs. multi_curl took this down to an average of 1.6 seconds.

  71. Thanks !
    It takes a lot of time to get results of about 80 pages without simultaneous requests, great job !!

  72. Thanks, bro! This tutorial totally helped me with my project.
