
Topic: Python wait before returning contents of web-page

sr. member
Activity: 504
Merit: 297
Hey bud.

This is in the wrong section. But I'll answer anyway.

You are making this waaaaaayy too hard.

Call this URL: http://www.fivb.org/Vis/Public/JS/Beach/TechPlayRank.aspx?Gender=1&id=BTechPlayW&Date=20180326
The "Date" field is a 4-digit year + 2-digit month + 2-digit day (YYYYMMDD).
Valid values for "Gender" are 0 or 1. The page you linked to uses 1, as in the link above.
The "id" field changes the source of the returned JS file to specify which element the data should be loaded into (I believe).
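
To make the parameters above concrete, here is a minimal sketch of building that URL for any date. The function names (build_rank_url, fetch_rank_js) and the default values are my own, not anything the site documents; the only thing taken from the thread is the URL and its three query parameters.

```python
from datetime import date
from urllib.parse import urlencode

BASE_URL = "http://www.fivb.org/Vis/Public/JS/Beach/TechPlayRank.aspx"

def build_rank_url(day, gender=1, element_id="BTechPlayW"):
    """Build the ranking URL; `day` is a datetime.date, sent as YYYYMMDD."""
    if gender not in (0, 1):
        raise ValueError("Gender must be 0 or 1")
    params = {
        "Gender": gender,
        "id": element_id,                # element the returned JS targets (assumed)
        "Date": day.strftime("%Y%m%d"),  # e.g. 20180326
    }
    return BASE_URL + "?" + urlencode(params)

def fetch_rank_js(day, gender=1, element_id="BTechPlayW"):
    """Download the JS text for one date (needs the third-party requests package)."""
    import requests  # imported here so building URLs works without it installed
    response = requests.get(build_rank_url(day, gender, element_id), timeout=10)
    response.raise_for_status()
    return response.text

url = build_rank_url(date(2018, 3, 26))
```

Calling fetch_rank_js(date(2018, 3, 26)) should then give you the raw JS text, assuming the endpoint still responds the way it did when this was written.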

That is the data that the page is loading (as you correctly guessed) via AJAX.
Now, the response you get back is actually a JS file, not plain HTML. So, have fun parsing that.
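
One possible way to approach the parsing, as a rough sketch only: I am assuming the JS carries the table markup inside a quoted string literal (for example something assigned to innerHTML). I have not checked the exact shape of the real response, so the sample string below is made up, and extract_string_literals is a hypothetical helper.

```python
import re

# Matches a single- or double-quoted JS string literal, allowing backslash escapes.
JS_STRING = re.compile(r"""(["'])((?:\\.|(?!\1).)*)\1""", re.DOTALL)

def extract_string_literals(js_source):
    """Return the raw contents of every quoted string literal in js_source."""
    return [match.group(2) for match in JS_STRING.finditer(js_source)]

# Hypothetical sample, NOT the real FIVB response.
sample = (
    "document.getElementById('BTechPlayW').innerHTML = "
    "'<table><tr><td>1</td><td>Team A</td></tr></table>';"
)

literals = extract_string_literals(sample)
# Assumption: the longest literal is the one holding the table markup.
html = max(literals, key=len)
```

The extracted text still contains any JS escape sequences (such as \' or \n), so those would need unescaping before handing the markup to BeautifulSoup.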

But all of the data is there, and requesting it directly is a more efficient way of scraping it than waiting for the page to render.
(I'm assuming that you DO have permission to be scraping this data, and if not, you should probably make sure you do.)

99.9% of the time, when you face a difficult task, there's an easier way to do it. ;)
Let me know what you think of the solution.
newbie
Activity: 4
Merit: 0
I'm trying to scrape this website: http://www.fivb.org/EN/BeachVolleyball/PlayersRanking_W.asp , but the page loads the contents of the table (probably through AJAX) after the page itself has loaded.

My attempt:

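import requests
from bs4 import BeautifulSoup

uri = 'http://www.fivb.org/EN/BeachVolleyball/PlayersRanking_W.asp'

# Fetch the static HTML; requests does not execute the page's JavaScript
r = requests.get(uri)
# Name the parser explicitly to avoid bs4's "no parser specified" warning
soup = BeautifulSoup(r.content, 'html.parser')
print(soup)
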
But the div with the id='BTechPlayM' remains empty, regardless of what I do. I've tried:

Setting a timeout on the request: requests.get(uri, timeout=10)
Passing headers
Using eventlet to set a delay
And the latest thing was to try the selenium library with PhantomJS (installed from NPM), but this rabbit hole just kept going deeper and deeper.
Is there a way to send a request to a URI, wait X seconds, and return the contents then?

... Or to send a request to a URI, keep checking whether a div contains an element, and only return the contents once it does?