How to scrape this website in Google Colab?

I am trying to scrape the following website:
https://khadamatt.abfaazarbaijan.ir/

The website works properly in a browser, but when I tried to scrape it in Google Colab, I ran into several problems.
At first I realized that, because of the structure of the website, a simple .get() from the requests library is not enough. After some searching, it seemed that I need to use Selenium.

Selenium needs quite a bit of configuration in Colab, which I tried to handle with the following library:
GitHub link for Google Colab Selenium

This package seems to work for many URLs, but I still can't read the content of my URL; I keep getting a connection timeout error.

My code is as follows:

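# Install the helper package that sets up Chrome and Selenium in Colab
%pip install google-colab-selenium
import google_colab_selenium as gs

# Start a Chrome session through the package, then load the page
driver = gs.Chrome()
driver.get('https://khadamatt.abfaazarbaijan.ir/')
print(driver.title)
driver.quit()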

The output is:


Message: unknown error: net::ERR_CONNECTION_TIMED_OUT
  (Session info: headless chrome=90.0.4430.212)
Stacktrace:
#0 0x5974e6ebf7f9 <unknown>
#1 0x5974e6e5f3b3 <unknown>
#2 0x5974e6ba7016 <unknown>
#3 0x5974e6ba4d4f <unknown>
#4 0x5974e6b921f9 <unknown>
#5 0x5974e6b93045 <unknown>
#6 0x5974e6b92541 <unknown>
#7 0x5974e6b91aba <unknown>
#8 0x5974e6b90ae7 <unknown>
#9 0x5974e6b90f51 <unknown>
#10 0x5974e6ba90d7 <unknown>
#11 0x5974e6c0fd27 <unknown>
#12 0x5974e6bfedc2 <unknown>
#13 0x5974e6c0f9e1 <unknown>
#14 0x5974e6bfec93 <unknown>
#15 0x5974e6bd0ce4 <unknown>
#16 0x5974e6bd24d2 <unknown>
#17 0x5974e6e8b542 <unknown>
#18 0x5974e6e9ace7 <unknown>
#19 0x5974e6e9a9e4 <unknown>
#20 0x5974e6e9f13a <unknown>
#21 0x5974e6e9b5b9 <unknown>
#22 0x5974e6e80e00 <unknown>
#23 0x5974e6eb25d2 <unknown>
#24 0x5974e6eb2778 <unknown>
#25 0x5974e6ecaa1f <unknown>
#26 0x7ec9f6c3fac3 <unknown>
#27 0x7ec9f6cd1850 <unknown>
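
Since net::ERR_CONNECTION_TIMED_OUT is reported by Chrome at the network level, one check that might separate a network problem from a Selenium problem is a plain TCP connection test from the same Colab runtime. Below is a minimal sketch, not a definitive diagnosis. It assumes port 443 and a 10-second timeout (both arbitrary choices), and `can_connect` is a hypothetical helper name, not part of any library:

```python
import socket


def can_connect(host: str, port: int = 443, timeout: float = 10.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within `timeout` seconds."""
    try:
        # create_connection resolves the host name and performs the TCP handshake
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers DNS failures, refused connections, and timeouts
        return False


# Example call (run in the same Colab runtime as the Selenium code):
# can_connect("khadamatt.abfaazarbaijan.ir")
```

If this returns False in Colab while the site opens in a local browser, the server may simply be unreachable from Colab's network, in which case changing the Selenium configuration alone would probably not help.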

I have already tried the solutions from these links:
Unable to scrape this site. How to scrape data from this site?
TypeError: WebDriver.__init__() got an unexpected keyword argument ‘executable_path’ in Selenium Python
TypeError: WebDriver.__init__() got multiple values for argument ‘options’
