[java] Direct download from Google Drive using Google Drive API

My desktop application, written in java, tries to download public files from Google Drive. As i found out, it can be implemented by using file's webContentLink (it's for ability to download public files without user authorization).

So, the code below works with small files:

String webContentLink = aFile.getWebContentLink();
InputStream in = new URL(webContentLink).openStream();

But it doesn't work on big files, because in this case file can't be downloaded directly via webContentLink without user confirmation with google virus scan warning. See an example: web content link.

So my question is how to get content of a public file from Google Drive without user authorization?

This question is related to java google-drive-api

The answer is


https://drive.google.com/uc?export=download&id=FILE_ID replace the FILE_ID with file id.

if you don't know were is file id then check this article Article LINK


If you just want to programmatically (as oppossed to giving the user a link to open in a browser) download a file through the Google Drive API, I would suggest using the downloadUrl of the file instead of the webContentLink, as documented here: https://developers.google.com/drive/web/manage-downloads


#Case 1: download file with small size.

#Case 2: download file with large size.

  • You stuck a wall of a virus scan alert page returned. By parsing html dom element, I tried to get link with confirm code under button "Download anyway" but it didn't work. Its may required cookie or session info. enter image description here

SOLUTION:

  • Finally I found solution for two above cases. Just need to put httpConnection.setDoOutput(true) in connection step to get a Json.

    )]}' { "disposition":"SCAN_CLEAN", "downloadUrl":"http:www...", "fileName":"exam_list_json.txt", "scanResult":"OK", "sizeBytes":2392}

Then, you can use any Json parser to read downloadUrl, fileName and sizeBytes.

  • You can refer follow snippet, hope it help.

    private InputStream gConnect(String remoteFile) throws IOException{
        URL  url = new URL(remoteFile);
        URLConnection connection = url.openConnection();
        if(connection instanceof HttpURLConnection){
            HttpURLConnection httpConnection = (HttpURLConnection) connection;
            connection.setAllowUserInteraction(false);
            httpConnection.setInstanceFollowRedirects(true);
            httpConnection.setRequestProperty("User-Agent", "Mozilla/4.0 (compatible; MSIE 6.0; Windows 2000)");
            httpConnection.setDoOutput(true);          
            httpConnection.setRequestMethod("GET");
            httpConnection.connect();
    
            int reqCode = httpConnection.getResponseCode();
    
    
            if(reqCode == HttpURLConnection.HTTP_OK){
                InputStream is = httpConnection.getInputStream();
                Map<String, List<String>> map = httpConnection.getHeaderFields();
                List<String> values = map.get("content-type");
                if(values != null && !values.isEmpty()){
                    String type = values.get(0);
    
                    if(type.contains("text/html")){
                        String cookie = httpConnection.getHeaderField("Set-Cookie");
                        String temp = Constants.getPath(mContext, Constants.PATH_TEMP) + "/temp.html";
                        if(saveGHtmlFile(is, temp)){
                            String href = getRealUrl(temp);
                            if(href != null){
                                return parseUrl(href, cookie);
                            }
                        }
    
    
                    } else if(type.contains("application/json")){
                        String temp = Constants.getPath(mContext, Constants.PATH_TEMP) + "/temp.txt";
                        if(saveGJsonFile(is, temp)){
                            FileDataSet data = JsonReaderHelper.readFileDataset(new File(temp));
                            if(data.getPath() != null){
                                return parseUrl(data.getPath());
                            }
                        }
                    }
                }
                return is;
            }
        }
        return null;
    }
    

And

   public static FileDataSet readFileDataset(File file) throws IOException{
        FileInputStream is = new FileInputStream(file);
        JsonReader reader = new JsonReader(new InputStreamReader(is, "UTF-8"));

        reader.beginObject();
        FileDataSet rs = new FileDataSet();
        while(reader.hasNext()){
            String name = reader.nextName();
            if(name.equals("downloadUrl")){
                rs.setPath(reader.nextString());
            } else if(name.equals("fileName")){
                rs.setName(reader.nextString());
            } else if(name.equals("sizeBytes")){
                rs.setSize(reader.nextLong());
            } else {
                reader.skipValue();
            }
        }
        reader.endObject();
        return rs;

    }

I simply create a javascript so that it automatically capture the link and download and close the tab with the help of tampermonkey.

// ==UserScript==
// @name         Bypass Google drive virus scan
// @namespace    SmartManoj
// @version      0.1
// @description  Quickly get the download link
// @author       SmartManoj
// @match        https://drive.google.com/uc?id=*&export=download*
// @grant        none
// ==/UserScript==

    function sleep(ms) {
      return new Promise(resolve => setTimeout(resolve, ms));
    }

    async function demo() {
        await sleep(5000);
        window.close();
    }

    (function() {
        location.replace(document.getElementById("uc-download-link").href);
        demo();
    })();

Similarly you can get the html source of the url and download in java.


This seems to be updated again as of May 19, 2015:

How I got it to work:

As in jmbertucci's recently updated answer, make your folder public to everyone. This is a bit more complicated than before, you have to click Advanced to change the folder to "On - Public on the web."

Find your folder UUID as before--just go into the folder and find your UUID in the address bar:

https://drive.google.com/drive/folders/<folder UUID>

Then head to

https://googledrive.com/host/<folder UUID>

It will redirect you to an index type page with a giant subdomain, but you should be able to see the files in your folder. Then you can right click to save the link to the file you want (I noticed that this direct link also has this big subdomain for googledrive.com). Worked great for me with wget.

This also seems to work with others' shared folders.

e.g.,

https://drive.google.com/folderview?id=0B7l10Bj_LprhQnpSRkpGMGV2eE0&usp=sharing

maps to

https://googledrive.com/host/0B7l10Bj_LprhQnpSRkpGMGV2eE0

And a right click can save a direct link to any of those files.


https://github.com/google/skicka

I used this command line tool to download files from Google Drive. Just follow the instructions in Getting Started section and you should download files from Google Drive in minutes.


If you face the "This file cannot be checked for viruses" intermezzo page, the download is not that easy.

You essentially need to first download the normal download link, which however redirects you to the "Download anyway" page. You need to store cookies from this first request, find out the link pointed to by the "Download anyway" button, and then use this link to download the file, but reusing the cookies you got from the first request.

Here's a bash variant of the download process using CURL:

curl -c /tmp/cookies "https://drive.google.com/uc?export=download&id=DOCUMENT_ID" > /tmp/intermezzo.html
curl -L -b /tmp/cookies "https://drive.google.com$(cat /tmp/intermezzo.html | grep -Po 'uc-download-link" [^>]* href="\K[^"]*' | sed 's/\&amp;/\&/g')" > FINAL_DOWNLOADED_FILENAME

Notes:

  • this procedure will probably stop working after some Google changes
  • the grep command uses Perl syntax (-P) and the \K "operator" which essentially means "do not include anything preceding \K to the matched result. I don't know which version of grep introduced these options, but ancient or non-Ubuntu versions probably don't have it
  • a Java solution would be more or less the same, just take a HTTPS library which can handle cookies, and some nice text-parsing library

I would consider downloading from the link, scraping the page that you get to grab the confirmation link, and then downloading that.

If you look at the "download anyway" URL it has an extra confirm query parameter with a seemingly randomly generated token. Since it's random...and you probably don't want to figure out how to generate it yourself, scraping might be the easiest way without knowing anything about how the site works.

You may need to consider various scenarios.


I know this is an old question but I could not find a solution to this problem after some research, so I am sharing what worked for me.

I have written this C# code for one of my projects. It can bypass the scan virus warning programmatically. The code can probably be converted to Java.

using System;
using System.Collections.Generic;
using System.ComponentModel;
using System.IO;
using System.Net;
using System.Text;

public class FileDownloader : IDisposable
{
    private const string GOOGLE_DRIVE_DOMAIN = "drive.google.com";
    private const string GOOGLE_DRIVE_DOMAIN2 = "https://drive.google.com";

    // In the worst case, it is necessary to send 3 download requests to the Drive address
    //   1. an NID cookie is returned instead of a download_warning cookie
    //   2. download_warning cookie returned
    //   3. the actual file is downloaded
    private const int GOOGLE_DRIVE_MAX_DOWNLOAD_ATTEMPT = 3;

    public delegate void DownloadProgressChangedEventHandler( object sender, DownloadProgress progress );

    // Custom download progress reporting (needed for Google Drive)
    public class DownloadProgress
    {
        public long BytesReceived, TotalBytesToReceive;
        public object UserState;

        public int ProgressPercentage
        {
            get
            {
                if( TotalBytesToReceive > 0L )
                    return (int) ( ( (double) BytesReceived / TotalBytesToReceive ) * 100 );

                return 0;
            }
        }
    }

    // Web client that preserves cookies (needed for Google Drive)
    private class CookieAwareWebClient : WebClient
    {
        private class CookieContainer
        {
            private readonly Dictionary<string, string> cookies = new Dictionary<string, string>();

            public string this[Uri address]
            {
                get
                {
                    string cookie;
                    if( cookies.TryGetValue( address.Host, out cookie ) )
                        return cookie;

                    return null;
                }
                set
                {
                    cookies[address.Host] = value;
                }
            }
        }

        private readonly CookieContainer cookies = new CookieContainer();
        public DownloadProgress ContentRangeTarget;

        protected override WebRequest GetWebRequest( Uri address )
        {
            WebRequest request = base.GetWebRequest( address );
            if( request is HttpWebRequest )
            {
                string cookie = cookies[address];
                if( cookie != null )
                    ( (HttpWebRequest) request ).Headers.Set( "cookie", cookie );

                if( ContentRangeTarget != null )
                    ( (HttpWebRequest) request ).AddRange( 0 );
            }

            return request;
        }

        protected override WebResponse GetWebResponse( WebRequest request, IAsyncResult result )
        {
            return ProcessResponse( base.GetWebResponse( request, result ) );
        }

        protected override WebResponse GetWebResponse( WebRequest request )
        {
            return ProcessResponse( base.GetWebResponse( request ) );
        }

        private WebResponse ProcessResponse( WebResponse response )
        {
            string[] cookies = response.Headers.GetValues( "Set-Cookie" );
            if( cookies != null && cookies.Length > 0 )
            {
                int length = 0;
                for( int i = 0; i < cookies.Length; i++ )
                    length += cookies[i].Length;

                StringBuilder cookie = new StringBuilder( length );
                for( int i = 0; i < cookies.Length; i++ )
                    cookie.Append( cookies[i] );

                this.cookies[response.ResponseUri] = cookie.ToString();
            }

            if( ContentRangeTarget != null )
            {
                string[] rangeLengthHeader = response.Headers.GetValues( "Content-Range" );
                if( rangeLengthHeader != null && rangeLengthHeader.Length > 0 )
                {
                    int splitIndex = rangeLengthHeader[0].LastIndexOf( '/' );
                    if( splitIndex >= 0 && splitIndex < rangeLengthHeader[0].Length - 1 )
                    {
                        long length;
                        if( long.TryParse( rangeLengthHeader[0].Substring( splitIndex + 1 ), out length ) )
                            ContentRangeTarget.TotalBytesToReceive = length;
                    }
                }
            }

            return response;
        }
    }

    private readonly CookieAwareWebClient webClient;
    private readonly DownloadProgress downloadProgress;

    private Uri downloadAddress;
    private string downloadPath;

    private bool asyncDownload;
    private object userToken;

    private bool downloadingDriveFile;
    private int driveDownloadAttempt;

    public event DownloadProgressChangedEventHandler DownloadProgressChanged;
    public event AsyncCompletedEventHandler DownloadFileCompleted;

    public FileDownloader()
    {
        webClient = new CookieAwareWebClient();
        webClient.DownloadProgressChanged += DownloadProgressChangedCallback;
        webClient.DownloadFileCompleted += DownloadFileCompletedCallback;

        downloadProgress = new DownloadProgress();
    }

    public void DownloadFile( string address, string fileName )
    {
        DownloadFile( address, fileName, false, null );
    }

    public void DownloadFileAsync( string address, string fileName, object userToken = null )
    {
        DownloadFile( address, fileName, true, userToken );
    }

    private void DownloadFile( string address, string fileName, bool asyncDownload, object userToken )
    {
        downloadingDriveFile = address.StartsWith( GOOGLE_DRIVE_DOMAIN ) || address.StartsWith( GOOGLE_DRIVE_DOMAIN2 );
        if( downloadingDriveFile )
        {
            address = GetGoogleDriveDownloadAddress( address );
            driveDownloadAttempt = 1;

            webClient.ContentRangeTarget = downloadProgress;
        }
        else
            webClient.ContentRangeTarget = null;

        downloadAddress = new Uri( address );
        downloadPath = fileName;

        downloadProgress.TotalBytesToReceive = -1L;
        downloadProgress.UserState = userToken;

        this.asyncDownload = asyncDownload;
        this.userToken = userToken;

        DownloadFileInternal();
    }

    private void DownloadFileInternal()
    {
        if( !asyncDownload )
        {
            webClient.DownloadFile( downloadAddress, downloadPath );

            // This callback isn't triggered for synchronous downloads, manually trigger it
            DownloadFileCompletedCallback( webClient, new AsyncCompletedEventArgs( null, false, null ) );
        }
        else if( userToken == null )
            webClient.DownloadFileAsync( downloadAddress, downloadPath );
        else
            webClient.DownloadFileAsync( downloadAddress, downloadPath, userToken );
    }

    private void DownloadProgressChangedCallback( object sender, DownloadProgressChangedEventArgs e )
    {
        if( DownloadProgressChanged != null )
        {
            downloadProgress.BytesReceived = e.BytesReceived;
            if( e.TotalBytesToReceive > 0L )
                downloadProgress.TotalBytesToReceive = e.TotalBytesToReceive;

            DownloadProgressChanged( this, downloadProgress );
        }
    }

    private void DownloadFileCompletedCallback( object sender, AsyncCompletedEventArgs e )
    {
        if( !downloadingDriveFile )
        {
            if( DownloadFileCompleted != null )
                DownloadFileCompleted( this, e );
        }
        else
        {
            if( driveDownloadAttempt < GOOGLE_DRIVE_MAX_DOWNLOAD_ATTEMPT && !ProcessDriveDownload() )
            {
                // Try downloading the Drive file again
                driveDownloadAttempt++;
                DownloadFileInternal();
            }
            else if( DownloadFileCompleted != null )
                DownloadFileCompleted( this, e );
        }
    }

    // Downloading large files from Google Drive prompts a warning screen and requires manual confirmation
    // Consider that case and try to confirm the download automatically if warning prompt occurs
    // Returns true, if no more download requests are necessary
    private bool ProcessDriveDownload()
    {
        FileInfo downloadedFile = new FileInfo( downloadPath );
        if( downloadedFile == null )
            return true;

        // Confirmation page is around 50KB, shouldn't be larger than 60KB
        if( downloadedFile.Length > 60000L )
            return true;

        // Downloaded file might be the confirmation page, check it
        string content;
        using( var reader = downloadedFile.OpenText() )
        {
            // Confirmation page starts with <!DOCTYPE html>, which can be preceeded by a newline
            char[] header = new char[20];
            int readCount = reader.ReadBlock( header, 0, 20 );
            if( readCount < 20 || !( new string( header ).Contains( "<!DOCTYPE html>" ) ) )
                return true;

            content = reader.ReadToEnd();
        }

        int linkIndex = content.LastIndexOf( "href=\"/uc?" );
        if( linkIndex < 0 )
            return true;

        linkIndex += 6;
        int linkEnd = content.IndexOf( '"', linkIndex );
        if( linkEnd < 0 )
            return true;

        downloadAddress = new Uri( "https://drive.google.com" + content.Substring( linkIndex, linkEnd - linkIndex ).Replace( "&amp;", "&" ) );
        return false;
    }

    // Handles the following formats (links can be preceeded by https://):
    // - drive.google.com/open?id=FILEID
    // - drive.google.com/file/d/FILEID/view?usp=sharing
    // - drive.google.com/uc?id=FILEID&export=download
    private string GetGoogleDriveDownloadAddress( string address )
    {
        int index = address.IndexOf( "id=" );
        int closingIndex;
        if( index > 0 )
        {
            index += 3;
            closingIndex = address.IndexOf( '&', index );
            if( closingIndex < 0 )
                closingIndex = address.Length;
        }
        else
        {
            index = address.IndexOf( "file/d/" );
            if( index < 0 ) // address is not in any of the supported forms
                return string.Empty;

            index += 7;

            closingIndex = address.IndexOf( '/', index );
            if( closingIndex < 0 )
            {
                closingIndex = address.IndexOf( '?', index );
                if( closingIndex < 0 )
                    closingIndex = address.Length;
            }
        }

        return string.Concat( "https://drive.google.com/uc?id=", address.Substring( index, closingIndex - index ), "&export=download" );
    }

    public void Dispose()
    {
        webClient.Dispose();
    }
}

And here's how you can use it:

// NOTE: FileDownloader is IDisposable!
FileDownloader fileDownloader = new FileDownloader();

// This callback is triggered for DownloadFileAsync only
fileDownloader.DownloadProgressChanged += ( sender, e ) => Console.WriteLine( "Progress changed " + e.BytesReceived + " " + e.TotalBytesToReceive );
// This callback is triggered for both DownloadFile and DownloadFileAsync
fileDownloader.DownloadFileCompleted += ( sender, e ) => Console.WriteLine( "Download completed" );

fileDownloader.DownloadFileAsync( "https://INSERT_DOWNLOAD_LINK_HERE", @"C:\downloadedFile.txt" );

Using a Service Account might work for you.


I faced an issue in direct download because I was logged in using multiple Google accounts. Solution is append authUser=0 parameter. Sample request URL to download :https://drive.google.com/uc?id=FILEID&authuser=0&export=download


Update as of August 2020:

This is what worked for me recently -

Upload your file and get a shareable link which anyone can see(Change permission from "Restricted" to "Anyone with the Link" in the share link options)

Then run:

 SHAREABLE_LINK=<google drive shareable link>
 curl -L https://drive.google.com/uc\?id\=$(echo $SHAREABLE_LINK | cut -f6 -d"/")

Check this out:

wget https://raw.githubusercontent.com/circulosmeos/gdown.pl/master/gdown.pl
chmod +x gdown.pl
./gdown.pl https://drive.google.com/file/d/FILE_ID/view TARGET_PATH