How to get a web page's source code from Java

Question

I just want to retrieve any web page's source code from Java. I have found lots of solutions so far, but I couldn't find any code that works for all the links below:

http://www.cumhuriyet.com.tr/?hn=298710
http://www.fotomac.com.tr/Yazarlar/Olcay%20%C3%87ak%C4%B1r/2011/11/23/hesap-makinesi
http://www.sabah.com.tr/Gundem/2011/12/23/basbakan-konferansta-konusuyor

The main problem for me is that some code retrieves the page source, but with parts missing. For example, the code below does not work for the first link:

    InputStream is = fURL.openStream(); // fURL can be one of the links above
    BufferedReader buffer = null;
    buffer = new BufferedReader(new InputStreamReader(is, "iso-8859-9"));

    int byteRead;
    while ((byteRead = buffer.read()) != -1) {
        builder.append((char) byteRead);
    }
    buffer.close();
    System.out.println(builder.toString());
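The snippet above hard-codes `iso-8859-9`, while the three sites may not all use the same encoding. One possible approach, sketched here under the assumption that the server declares the encoding in its `Content-Type` header, is to parse the charset from that header and fall back to a default otherwise. The class name `CharsetSniffer` and its method are hypothetical names chosen for illustration.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

final class CharsetSniffer {

    // Returns the charset named in a Content-Type header value, or the
    // fallback when the header is absent or names no usable charset.
    static Charset charsetFromContentType(String contentType, Charset fallback) {
        if (contentType == null) {
            return fallback;
        }
        for (String part : contentType.split(";")) {
            String param = part.trim();
            // Parameter names are case-insensitive, e.g. "Charset=utf-8".
            if (param.regionMatches(true, 0, "charset=", 0, 8)) {
                String name = param.substring(8).trim().replace("\"", "");
                try {
                    return Charset.forName(name);
                } catch (IllegalArgumentException e) {
                    // Malformed or unsupported charset name.
                    return fallback;
                }
            }
        }
        return fallback;
    }

    public static void main(String[] args) {
        // Header values here are made-up samples, not real server responses.
        System.out.println(charsetFromContentType("text/html; charset=UTF-8", StandardCharsets.ISO_8859_1));
        System.out.println(charsetFromContentType("text/html", StandardCharsets.ISO_8859_1));
    }
}
```

With a `URLConnection`, the header value is available from `getContentType()`, so the reader could be built as `new InputStreamReader(conn.getInputStream(), CharsetSniffer.charsetFromContentType(conn.getContentType(), fallback))`. This sketch does not look at a charset declared only in an HTML `<meta>` tag.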

User · Answer

Try the following code with an added request property:

    import java.io.BufferedReader;
    import java.io.IOException;
    import java.io.InputStream;
    import java.io.InputStreamReader;
    import java.net.URL;
    import java.net.URLConnection;

    public class SocketConnection
    {
        public static String getURLSource(String url) throws IOException
        {
            URL urlObject = new URL(url);
            URLConnection urlConnection = urlObject.openConnection();
            urlConnection.setRequestProperty("User-Agent", "Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.11 (KHTML, like Gecko) Chrome/23.0.1271.95 Safari/537.11");

            return toString(urlConnection.getInputStream());
        }

        private static String toString(InputStream inputStream) throws IOException
        {
            try (BufferedReader bufferedReader = new BufferedReader(new InputStreamReader(inputStream, "UTF-8")))
            {
                String inputLine;
                StringBuilder stringBuilder = new StringBuilder();
                while ((inputLine = bufferedReader.readLine()) != null)
                {
                    stringBuilder.append(inputLine);
                }

                return stringBuilder.toString();
            }
        }
    }
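One thing to be aware of in the answer above: `readLine()` strips line terminators, so the returned string is the page collapsed onto a single line. If the original line breaks matter, a buffered character read keeps them. The following is a minimal sketch of that alternative; `SourceReader` and `readAll` are hypothetical names, not part of the answer.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.Reader;
import java.nio.charset.Charset;

final class SourceReader {

    // Reads the whole stream as text in the given charset, keeping line
    // terminators exactly as they were sent. Closes the stream when done.
    static String readAll(InputStream in, Charset charset) throws IOException {
        StringBuilder out = new StringBuilder();
        char[] buffer = new char[8192];
        try (Reader reader = new InputStreamReader(in, charset)) {
            int n;
            while ((n = reader.read(buffer)) != -1) {
                out.append(buffer, 0, n);
            }
        }
        return out.toString();
    }
}
```

In the answer's `getURLSource`, the last line could then become `return SourceReader.readAll(urlConnection.getInputStream(), StandardCharsets.UTF_8);`, assuming the page is UTF-8 as the answer does.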

User · Answer

I am sure that you have found a solution somewhere over the past 2 years, but the following is a solution that works for your requested site:

    package javasandbox;

    import java.io.BufferedReader;
    import java.io.IOException;
    import java.io.InputStreamReader;
    import java.net.HttpURLConnection;
    import java.net.MalformedURLException;
    import java.net.URL;

    /**
     *
     * @author Ryan Oglesby
     */
    public class JavaSandbox {

        private static String sURL;

        /**
         * @param args the command line arguments
         */
        public static void main(String[] args) throws MalformedURLException, IOException {
            sURL = "http://www.cumhuriyet.com.tr/?hn=298710";
            System.out.println(sURL);
            URL url = new URL(sURL);
            HttpURLConnection httpCon = (HttpURLConnection) url.openConnection();
            //set http request headers
            httpCon.addRequestProperty("Host", "www.cumhuriyet.com.tr");
            httpCon.addRequestProperty("Connection", "keep-alive");
            httpCon.addRequestProperty("Cache-Control", "max-age=0");
            httpCon.addRequestProperty("Accept", "text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8");
            httpCon.addRequestProperty("User-Agent", "Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/30.0.1599.101 Safari/537.36");
            // Note: the body is read below as-is, so a compressed response would need decoding first.
            httpCon.addRequestProperty("Accept-Encoding", "gzip,deflate,sdch");
            httpCon.addRequestProperty("Accept-Language", "en-US,en;q=0.8");
            //httpCon.addRequestProperty("Cookie", "JSESSIONID=EC0F373FCC023CD3B8B9C1E2E2F7606C; lang=tr; __utma=169322547.1217782332.1386173665.1386173665.1386173665.1; __utmb=169322547.1.10.1386173665; __utmc=169322547; __utmz=169322547.1386173665.1.1.utmcsr=stackoverflow.com|utmccn=(referral)|utmcmd=referral|utmcct=/questions/8616781/how-to-get-a-web-pages-source-code-from-java; __gads=ID=3ab4e50d8713e391:T=1386173664:S=ALNI_Mb8N_wW0xS_wRa68vhR0gTRl8MwFA; scrElm=body");
            HttpURLConnection.setFollowRedirects(false);
            httpCon.setInstanceFollowRedirects(false);
            httpCon.setDoOutput(true);
            httpCon.setUseCaches(true);

            httpCon.setRequestMethod("GET");

            BufferedReader in = new BufferedReader(new InputStreamReader(httpCon.getInputStream(), "UTF-8"));
            String inputLine;
            StringBuilder a = new StringBuilder();
            while ((inputLine = in.readLine()) != null)
                a.append(inputLine);
            in.close();

            System.out.println(a.toString());

            httpCon.disconnect();
        }
    }
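The answer above advertises `gzip` and `deflate` in `Accept-Encoding` but reads the response body directly. If a server does reply with a compressed body, the bytes need to be decoded before they are turned into text. The following is a rough sketch of one way to do that with the standard `java.util.zip` classes; `ContentDecoder` and `decode` are hypothetical names, and the sketch assumes `deflate` bodies use the zlib wrapper, which not every server honours.

```java
import java.io.IOException;
import java.io.InputStream;
import java.util.zip.GZIPInputStream;
import java.util.zip.InflaterInputStream;

final class ContentDecoder {

    // Wraps the response body according to the Content-Encoding header value.
    // Unknown or absent encodings return the stream unchanged.
    static InputStream decode(InputStream body, String contentEncoding) throws IOException {
        if (contentEncoding == null) {
            return body;
        }
        String encoding = contentEncoding.trim();
        if (encoding.equalsIgnoreCase("gzip")) {
            return new GZIPInputStream(body);
        }
        if (encoding.equalsIgnoreCase("deflate")) {
            // Assumes zlib-wrapped deflate data.
            return new InflaterInputStream(body);
        }
        return body;
    }
}
```

In the answer's code this would look like `ContentDecoder.decode(httpCon.getInputStream(), httpCon.getContentEncoding())` passed to the `InputStreamReader`. Since nothing here decodes `sdch`, it is probably safer not to list it in `Accept-Encoding`, or to omit the header altogether.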

User · Answer

    URL yahoo = new URL("http://www.yahoo.com/");
    BufferedReader in = new BufferedReader(
                new InputStreamReader(
                yahoo.openStream()));

    String inputLine;

    while ((inputLine = in.readLine()) != null)
        System.out.println(inputLine);

    in.close();
