[html] Scraping data from website using vba

There are several ways of doing this. This is an answer that I write hoping that all the basics of Internet Explorer automation will be found when browsing for the keywords "scraping data from website", but remember that nothing's worth as your own research (if you don't want to stick to pre-written codes that you're not able to customize).

Please note that this is one way, that I don't prefer in terms of performance (since it depends on the browser speed) but that is good to understand the rationale behind Internet automation.

1) If I need to browse the web, I need a browser! So I create an Internet Explorer browser:

Dim appIE As Object
Set appIE = CreateObject("internetexplorer.application")

2) I ask the browser to browse the target webpage. Through the use of the property ".Visible", I decide if I want to see the browser doing its job or not. When building the code is nice to have Visible = True, but when the code is working for scraping data is nice not to see it everytime so Visible = False.

With appIE
    .Navigate "http://uk.investing.com/rates-bonds/financial-futures"
    .Visible = True
End With

3) The webpage will need some time to load. So, I will wait meanwhile it's busy...

Do While appIE.Busy
    DoEvents
Loop

4) Well, now the page is loaded. Let's say that I want to scrape the change of the US30Y T-Bond: What I will do is just clicking F12 on Internet Explorer to see the webpage's code, and hence using the pointer (in red circle) I will click on the element that I want to scrape to see how can I reach my purpose.

enter image description here

5) What I should do is straight-forward. First of all, I will get by the ID property the tr element which is containing the value:

Set allRowOfData = appIE.document.getElementById("pair_8907")

Here I will get a collection of td elements (specifically, tr is a row of data, and the td are its cells. We are looking for the 8th, so I will write:

Dim myValue As String: myValue = allRowOfData.Cells(7).innerHTML

Why did I write 7 instead of 8? Because the collections of cells starts from 0, so the index of the 8th element is 7 (8-1). Shortly analysing this line of code:

  • .Cells() makes me access the td elements;
  • innerHTML is the property of the cell containing the value we look for.

Once we have our value, which is now stored into the myValue variable, we can just close the IE browser and releasing the memory by setting it to Nothing:

appIE.Quit
Set appIE = Nothing

Well, now you have your value and you can do whatever you want with it: put it into a cell (Range("A1").Value = myValue), or into a label of a form (Me.label1.Text = myValue).

I'd just like to point you out that this is not how StackOverflow works: here you post questions about specific coding problems, but you should make your own search first. The reason why I'm answering a question which is not showing too much research effort is just that I see it asked several times and, back to the time when I learned how to do this, I remember that I would have liked having some better support to get started with. So I hope that this answer, which is just a "study input" and not at all the best/most complete solution, can be a support for next user having your same problem. Because I have learned how to program thanks to this community, and I like to think that you and other beginners might use my input to discover the beautiful world of programming.

Enjoy your practice ;)

Examples related to html

Embed ruby within URL : Middleman Blog Please help me convert this script to a simple image slider Generating a list of pages (not posts) without the index file Why there is this "clear" class before footer? Is it possible to change the content HTML5 alert messages? Getting all files in directory with ajax DevTools failed to load SourceMap: Could not load content for chrome-extension How to set width of mat-table column in angular? How to open a link in new tab using angular? ERROR Error: Uncaught (in promise), Cannot match any routes. URL Segment

Examples related to vba

Copy filtered data to another sheet using VBA Better way to find last used row Check if a value is in an array or not with Excel VBA Creating an Array from a Range in VBA Excel: macro to export worksheet as CSV file without leaving my current Excel sheet VBA: Convert Text to Number What's the difference between "end" and "exit sub" in VBA? Rename Excel Sheet with VBA Macro Extract Data from PDF and Add to Worksheet Quicker way to get all unique values of a column in VBA?

Examples related to excel

Python: Pandas pd.read_excel giving ImportError: Install xlrd >= 0.9.0 for Excel support Converting unix time into date-time via excel How to increment a letter N times per iteration and store in an array? 'Microsoft.ACE.OLEDB.16.0' provider is not registered on the local machine. (System.Data) How to import an Excel file into SQL Server? Copy filtered data to another sheet using VBA Better way to find last used row Could pandas use column as index? Check if a value is in an array or not with Excel VBA How to sort dates from Oldest to Newest in Excel?

Examples related to web-scraping

Scraping: SSL: CERTIFICATE_VERIFY_FAILED error for http://en.wikipedia.org How to print an exception in Python 3? What should I use to open a url instead of urlopen in urllib3 Use Excel VBA to click on a button in Internet Explorer, when the button has no "name" associated How to use Python requests to fake a browser visit a.k.a and generate User Agent? Scraping data from website using vba Using BeautifulSoup to extract text without tags Is it ok to scrape data from Google results? What's the best way of scraping data from a website? Use getElementById on HTMLElement instead of HTMLDocument