[google-api] What are the alternatives now that the Google web search API has been deprecated?

Google Web Search API has been deprecated and replaced with Custom Search API (see http://code.google.com/apis/websearch/).

I wanted to search the whole web but it looks like with the new API only custom sites can be searched.

Is there a way to search the whole web programmatically? I was able to query the old API using JSON from a Java program.



There's a free Java API called JFreeWebSearch which uses the already mentioned Faroo: http://www.ke.tu-darmstadt.de/resources/jfreewebsearch


Google Custom Search (as advocated in the top-rated answers) works well, but is very expensive compared to its competitors (below) or to other Google APIs. It has a small free tier (100 queries/day) and a very high price of $5 per 1000 queries.

They offer the option to upgrade to Site Search, which has slightly better prices, but that is meant for searching one site (your own), so it is really something quite different - not an upgrade.

The main alternatives seem to be:

Bing Search API
https://datamarket.azure.com/dataset/5BA839F1-12CE-4CCE-BF57-A49D98D29A44
It has a free tier of 5000 queries/month, prices starting at 5 queries per penny, and no hard limit.

UPDATE: At the end of 2016 this API was shut down in favour of its Azure counterpart "Cognitive Services Bing Search API":
https://azure.microsoft.com/en-us/services/cognitive-services/search/

See here for a pricing chart, which starts at US$3/m for 1,000 transactions. Unless I'm missing something it is quite expensive.

Yahoo BOSS Search API
UPDATE: Was discontinued on March 31, 2016. http://developer.yahoo.com/boss/search/
With prices starting at about 12 queries/penny for whole web searches.

And some I haven't heard of before:

http://www.gigablast.com/searchfeed.html

http://www.faroo.com/hp/api/api.html

http://www.commoncrawl.org/

http://www.entireweb.com/search_api/implementation/
[discontinued - as pointed out below]

There is a bit of discussion of some of these on this SO post.
[got closed for being off-topic and is now gone]
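To put the prices quoted above on one scale, here is a small sketch that converts "queries per penny" into US dollars per 1000 queries. It assumes a penny means one US cent and uses only the entry-level rates mentioned in this answer; the function name `usd_per_1000` is illustrative. Since the old Bing and Yahoo BOSS APIs have been discontinued, the comparison is historical.

```python
def usd_per_1000(queries_per_penny):
    """Convert a 'queries per penny' rate into US dollars per 1000 queries."""
    pennies_needed = 1000 / queries_per_penny
    return pennies_needed / 100  # 100 pennies per dollar (assumes US cents)


# Entry-level rates quoted in the answer above.
prices = {
    "Google Custom Search": 5.0,             # stated directly as $5 per 1000
    "Bing Search API": usd_per_1000(5),      # 5 queries per penny
    "Yahoo BOSS Search API": usd_per_1000(12),  # about 12 queries per penny
}

# Print the services from cheapest to most expensive.
for name, price in sorted(prices.items(), key=lambda item: item[1]):
    print(f"{name}: ${price:.2f} per 1000 queries")
```

On these figures Google Custom Search costs roughly two and a half times the old Bing rate and about six times the Yahoo BOSS rate.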


You can create an "everywhere" custom search engine right from the Google Custom Search homepage ( http://www.google.com/cse/ ). Just click 'advanced' while adding a new engine. There you can provide a Schema.org site type. 'Thing' is the most generic type, which covers the whole web.


There's a note on top of the docs:

Note: The Google Web Search API has been officially deprecated as of November 1, 2010. It will continue to work as per our deprecation policy, but the number of requests you may make per day will be limited. Therefore, we encourage you to move to the new Custom Search API.

The deprecation policy says that they will continue to run the API for 3 years. So if you already have an application that uses the old API, you don't have to rush to change things just yet. If you're writing a new application, use the Custom Search API. See my answer here for how to do this in Python, but the idea's the same for any language.


Yes, Google Custom Search has now replaced the old Search API, but you can still use Google Custom Search to search the entire web, although the steps are not obvious from the Custom Search setup.

To create a Google Custom Search engine that searches the entire web:

  1. From the Google Custom Search homepage ( http://www.google.com/cse/ ), click Create a Custom Search Engine.
  2. Type a name and description for your search engine.
  3. Under Define your search engine, in the Sites to Search box, enter at least one valid URL (for now, just put www.anyurl.com to get past this screen; more on this later).
  4. Select the CSE edition you want and accept the Terms of Service, then click Next. Select the layout option you want, and then click Next.
  5. Click any of the links under the Next steps section to navigate to your Control panel.
  6. In the left-hand menu, under Control Panel, click Basics.
  7. In the Search Preferences section, select Search the entire web but emphasize included sites.
  8. Click Save Changes.
  9. In the left-hand menu, under Control Panel, click Sites.
  10. Delete the site you entered during the initial setup process.

Now your custom search engine will search the entire web.
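Once the engine exists, it can be queried over HTTP and the JSON response parsed, much like the old API. The following is a minimal sketch, assuming the Custom Search JSON API endpoint `https://www.googleapis.com/customsearch/v1` with its `key`, `cx`, `q` and `start` parameters. The helper names `build_search_url` and `search` are illustrative, and the API key and engine ID are values you have to supply yourself.

```python
import json
import urllib.parse
import urllib.request

# Endpoint of the Custom Search JSON API.
CSE_ENDPOINT = "https://www.googleapis.com/customsearch/v1"


def build_search_url(api_key, engine_id, query, start=1):
    """Return the request URL for one page of results."""
    params = {
        "key": api_key,    # your API key
        "cx": engine_id,   # ID of the custom search engine created above
        "q": query,
        "start": start,    # 1-based index of the first result to return
    }
    return CSE_ENDPOINT + "?" + urllib.parse.urlencode(params)


def search(api_key, engine_id, query):
    """Run one query and return (title, link) pairs from the JSON response."""
    url = build_search_url(api_key, engine_id, query)
    with urllib.request.urlopen(url) as response:
        data = json.load(response)
    # The "items" key is absent when the query has no results.
    return [(item["title"], item["link"]) for item in data.get("items", [])]
```

Each call to `search` issues one request, so it counts against the daily query quota.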

Pricing

  • Google Custom Search gives you 100 queries per day for free.
  • After that you pay $5 per 1000 queries.
  • There is a maximum of 10,000 queries per day.

Source: https://developers.google.com/custom-search/json-api/v1/overview#Pricing
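As a rough illustration of what these numbers mean in practice, the sketch below estimates the cost of one day's usage. It assumes billing is linear per query beyond the free tier and that the 10,000 daily maximum includes the 100 free queries; neither detail is stated above, so treat the result as an estimate only. The name `daily_cost_usd` is made up for this example.

```python
FREE_QUERIES_PER_DAY = 100
PRICE_PER_1000_USD = 5.0
MAX_QUERIES_PER_DAY = 10_000  # assumed to include the free queries


def daily_cost_usd(queries):
    """Estimate one day's cost in US dollars for the given number of queries."""
    if queries < 0:
        raise ValueError("queries must not be negative")
    if queries > MAX_QUERIES_PER_DAY:
        raise ValueError("exceeds the maximum of 10,000 queries per day")
    # Only queries beyond the free tier are billed (assumed linear pricing).
    billable = max(0, queries - FREE_QUERIES_PER_DAY)
    return billable * PRICE_PER_1000_USD / 1000


for n in (100, 1_100, 10_000):
    print(n, "queries/day ->", daily_cost_usd(n), "USD")
```

Under these assumptions, running at the daily maximum every day comes to roughly $1,500 per month.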


  • The search quality is much lower than normal Google search (no synonyms, "intelligence" etc.)
  • It seems that Google is even planning to shut down this service completely.

I have just come across this from Common Crawl.

http://www.commoncrawl.org/

Might be the answer we are all looking for!!


There is an option at the bottom of the Custom Search Control Panel, under "Sites to search": you can choose "Search the entire web but emphasize included sites".



Gigablast offers a cheap web search API: http://www.gigablast.com/searchfeed.html


Faroo has a free Web Search API

