Apify and Crawlee Official Forum

Updated 3 months ago

SSL Error

Hi guys! I'm getting an SSL error when I put in a TripAdvisor URL; if I put in another URL it works fine. Please help me out.
Attachment: image.png
35 comments
Hi, based on the error I would expect that you are getting timed out by TripAdvisor. Are you using any anti-bot approach, such as proxies, fingerprints, etc.?
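For context, routing a request through a proxy with the requests library looks roughly like this. This is a minimal sketch; the proxy URL, credentials, and User-Agent string are placeholders, not working values:

```python
import requests

# Placeholder proxy endpoint; replace with a real proxy (e.g. one from Apify Proxy)
proxies = {
    "http": "http://user:password@proxy.example.com:8000",
    "https": "http://user:password@proxy.example.com:8000",
}

# Sending a browser-like User-Agent alongside a proxy is a common first step
# when a site appears to block plain scripted requests
headers = {"User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64)"}

response = requests.get(
    "https://www.tripadvisor.com/",
    proxies=proxies,
    headers=headers,
    timeout=30,
)
print(response.status_code)
```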
No, it's just simple code making a request.
It works for other URLs, but not for TripAdvisor.
Can you share a reproducible code example?
I don't think this is related to Apify directly; it might be some config in the requests library or similar. We need to see the code, of course.
This is a picture of the code. Please check.
Attachment: IMG_3465.jpg
This is the Apify boilerplate.
Please take a look.
Hi Lukas, hope you're doing well. Can you please take a look?
I will try to reproduce it
Thanks, waiting for your reply.
Hi, have you checked? Please let me know. Thank you.
I can reproduce it, checking with the team
Thank you, really appreciate it.
What is the error? Are you able to check it?
Hi Lukas, please reply. I need this working so I can run my crawler; please help me.
Sorry, this might take a while before the Python team figures this out cc
Thanks for the update. Really appreciate it.
Maybe using the requests library or something else would fix it,
or just finding an ignore-SSL option.
Any example code?
With a little help from ChatGPT πŸ™‚
```python
from urllib.parse import urljoin
from bs4 import BeautifulSoup
import requests

# Define a function to scrape the given URL up to the specified maximum depth
def scrape(url, depth, max_depth):
    if depth > max_depth:
        return
    print(f'Scraping {url} at depth {depth}...')
    # Try to send a GET request to the URL
    try:
        response = requests.get(url)
        soup = BeautifulSoup(response.content, 'html.parser')
        # If we haven't reached the max depth, look for nested links and enqueue their targets
        if depth < max_depth:
            for link in soup.find_all('a'):
                link_href = link.get('href')
                if link_href and link_href.startswith(('http://', 'https://')):
                    link_url = urljoin(url, link_href)
                    print(f'Found link: {link_url}')
                    scrape(link_url, depth + 1, max_depth)
        # Extract and print the title of the page
        title = soup.title.string if soup.title else "No Title"
        print(f'Title: {title}')
    except requests.exceptions.RequestException as e:
        print(f'An error occurred: {e}')

# Main function to start scraping
def main():
    start_urls = [{'url': 'https://www.tripadvisor.com/'}]  # Example start URL
    max_depth = 1  # Example max depth
    # Start scraping from the first URL
    for start_url in start_urls:
        url = start_url.get('url')
        print(f'Starting scrape for: {url}')
        scrape(url, 0, max_depth)

if __name__ == "__main__":
    main()
```
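For the SSL error itself, the ignore-SSL idea mentioned above could be tried by turning off certificate verification in requests. A minimal sketch for debugging only; the URL is just an example, and verify=False should not be left in production code:

```python
import requests
import urllib3

# Suppress the warning that urllib3 emits when verification is disabled
urllib3.disable_warnings(urllib3.exceptions.InsecureRequestWarning)

url = "https://www.tripadvisor.com/"

# verify=False skips SSL certificate verification for this request only;
# useful to confirm whether the error comes from the TLS handshake
response = requests.get(url, verify=False, timeout=30)
print(response.status_code)
```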
Thanks, will check this.
The site is secured by SSL, so we cannot ignore it, I think.
Stuck on the request...
JS libraries allow you to ignore SSL errors; let me check.
I’m using Python.
The link is for the Python httpx library.
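For reference, httpx exposes a verify flag that can be switched off while debugging. A minimal sketch; disabling certificate verification is unsafe outside of debugging, and the URL is just an example:

```python
import httpx

url = "https://www.tripadvisor.com/"

# verify=False disables SSL certificate verification in httpx;
# only use this to diagnose the handshake problem, not in production
with httpx.Client(verify=False, timeout=30.0, follow_redirects=True) as client:
    response = client.get(url)
    print(response.status_code)
```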
Let me take a look.