How To Extract Source Html From Webpage?
I am trying to extract the HTML source of this page: http://www.fxstreet.com/rates-charts/currency-rates/ I want what I see when I save the page from Chrome as a .html file. I t
Solution 1:
Try using HtmlUnit with JavaScript enabled, i.e. call setJavaScriptEnabled(true) on the WebClient options.
Jsoup is not a headless browser and does not execute JavaScript, so you must use another library to fetch the rendered page; you can then use Jsoup to parse it.
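As a rough sketch of the HtmlUnit approach (not a definitive implementation): the class name RenderedPageFetcher, the method name fetchRenderedHtml and the waitMillis parameter are made up for illustration, and the imports assume HtmlUnit 2.x (in HtmlUnit 3.x and later the package is org.htmlunit instead of com.gargoylesoftware.htmlunit).

```java
import com.gargoylesoftware.htmlunit.BrowserVersion;
import com.gargoylesoftware.htmlunit.WebClient;
import com.gargoylesoftware.htmlunit.html.HtmlPage;

import java.io.IOException;

// Hypothetical helper class; the names are illustrative, not part of HtmlUnit.
public class RenderedPageFetcher {

    // Loads the URL in a headless HtmlUnit browser, lets scripts run,
    // and returns the resulting DOM serialized as markup.
    public static String fetchRenderedHtml(String url, long waitMillis) throws IOException {
        try (WebClient webClient = new WebClient(BrowserVersion.CHROME)) {
            // Run the page's scripts, which Jsoup alone cannot do
            webClient.getOptions().setJavaScriptEnabled(true);
            // CSS is not needed just to read the markup
            webClient.getOptions().setCssEnabled(false);
            // Real-world pages often contain script errors; do not abort on them
            webClient.getOptions().setThrowExceptionOnScriptError(false);

            HtmlPage page = webClient.getPage(url);
            // Give background scripts (AJAX calls, timers) up to waitMillis to finish
            webClient.waitForBackgroundJavaScript(waitMillis);
            return page.asXml();
        }
    }
}
```

Calling RenderedPageFetcher.fetchRenderedHtml("http://www.fxstreet.com/rates-charts/currency-rates/", 10000) should then return the markup after the scripts have run; the 10-second wait is a guess and may need tuning. The returned string can be handed to Jsoup.parse(html, url) for parsing. Note that asXml() serializes HtmlUnit's DOM, so the output may not be byte-for-byte identical to what Chrome saves.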
Solution 2:
Extracting just the main table can easily be done with Jsoup.
Here is a method that fetches the page and prints all the content of the main table:
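import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;
import org.jsoup.select.Elements;

public void parse() {
    try {
        // Fetch the page and parse it into a DOM (Jsoup does not run JavaScript)
        Document doc = Jsoup.connect("http://www.fxstreet.com/rates-charts/currency-rates/").get();
        // Looked up by id; not used below
        Element content = doc.getElementById("ddlPairsChoose");
        // Select every element carrying the class of the main table container
        Elements table = doc.getElementsByClass("applet-content");
        System.out.print(table);
    } catch (Exception e) {
        System.out.print("error --> " + e);
    }
}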
It prints out the HTML of the table on the page.