
Apify and Crawlee Official Forum


SSL Error

Hi guys! I'm getting an SSL error when I use a TripAdvisor URL; other URLs work fine. Please help me out.
Attachment: image.png
35 comments
Hi, based on the error I would expect that you are being timed out by TripAdvisor. Are you using any anti-bot-detection approach, such as proxies, fingerprints, etc.?
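For reference, "using proxies" here usually means routing the request through a proxy server. A minimal sketch with the requests library; the proxy URL and credentials below are placeholders, not real values:

import requests

# Placeholder proxy endpoint; substitute a real proxy URL and credentials.
proxies = {
    'http': 'http://user:password@proxy.example.com:8000',
    'https': 'http://user:password@proxy.example.com:8000',
}

# Route the request through the proxy. Sites like TripAdvisor may still
# block datacenter IPs, so residential proxies are often required.
response = requests.get('https://www.tripadvisor.com/', proxies=proxies, timeout=30)
print(response.status_code)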
No, it's simple code for making a request.
It works for other URLs but not for TripAdvisor.
Can you share a reproducible example of the code?
I don't think this is related to Apify directly; it might be some config in the requests library or similar. We need to see the code, of course.
This is a picture of the code. Please check.
Attachment: IMG_3465.jpg
This is the Apify boilerplate.
Please take a look.
Hi Lukas, hope you're doing well. Can you please take a look?
I will try to reproduce it
Thanks, waiting for your reply.
Hi, have you checked? Please let me know. Thank you.
I can reproduce it, checking with the team
Thank you, really appreciate it.
What is the error? Were you able to check it?
Hi Lukas, please reply. I need this working so I can run my crawler; please help me.
Sorry, this might take a while before the Python team figures this out cc
Thanks for the update. Really appreciate it.
Maybe using the requests library or something else would fix it,
or just finding an option to ignore SSL verification.
Any example code?
With a little help from ChatGPT 🙂
from urllib.parse import urljoin
from bs4 import BeautifulSoup
import requests

# Define a function to scrape the given URL up to the specified maximum depth
def scrape(url, depth, max_depth):
    if depth > max_depth:
        return
    print(f'Scraping {url} at depth {depth}...')
    # Try to send a GET request to the URL
    try:
        response = requests.get(url)
        soup = BeautifulSoup(response.content, 'html.parser')
        # If we haven't reached the max depth, look for nested links and enqueue their targets
        if depth < max_depth:
            for link in soup.find_all('a'):
                link_href = link.get('href')
                if link_href and link_href.startswith(('http://', 'https://')):
                    link_url = urljoin(url, link_href)
                    print(f'Found link: {link_url}')
                    scrape(link_url, depth + 1, max_depth)
        # Extract and print the title of the page
        title = soup.title.string if soup.title else "No Title"
        print(f'Title: {title}')
    except requests.exceptions.RequestException as e:
        print(f'An error occurred: {e}')

# Main function to start scraping
def main():
    start_urls = [{'url': 'https://www.tripadvisor.com/'}]  # Example start URL
    max_depth = 1  # Example max depth
    # Start scraping from the first URL
    for start_url in start_urls:
        url = start_url.get('url')
        print(f'Starting scrape for: {url}')
        scrape(url, 0, max_depth)

if __name__ == "__main__":
    main()
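The "ignore SSL" option mentioned above maps to the verify flag in requests. Disabling certificate verification is not recommended beyond debugging, but as a quick check it would look roughly like this:

import requests
import urllib3

# Suppress the InsecureRequestWarning that requests raises when verification is disabled.
urllib3.disable_warnings(urllib3.exceptions.InsecureRequestWarning)

# verify=False skips TLS certificate verification for this request only,
# which helps confirm whether the SSL error comes from certificate checks.
response = requests.get('https://www.tripadvisor.com/', verify=False, timeout=30)
print(response.status_code)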
Thanks, will check this.
The site is secured by SSL, so we cannot ignore it, I think.
Stuck on the request...
JS libraries allow ignoring SSL errors; let me check.
I'm using Python.
The link is for the Python httpx library.
Let me take a look.
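For the httpx route, the equivalent switch is also called verify. A minimal sketch, assuming the goal is just to confirm whether the error disappears with verification turned off:

import httpx

# verify=False disables TLS certificate verification for this client;
# use it only for debugging, not for production scraping.
with httpx.Client(verify=False, timeout=30.0, follow_redirects=True) as client:
    response = client.get('https://www.tripadvisor.com/')
    print(response.status_code)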