Apify

Apify and Crawlee Official Forum


How can I use the Playwright Crawler and BeautifulSoup Crawler in the same Actor?

I need this so that Playwright can fill in and submit a website's search page, which relies on dynamic JavaScript. Once the results are shown, I want to use the BeautifulSoup crawler to open each product page and parse the information. Opening each product page with Playwright takes a very long time. I cannot seem to run both crawlers at the same time.
7 comments
The link contains the answer
I want to build my own Actor with Playwright and BeautifulSoup.
I am looking for exactly this solution. First I want to send an HTTP request, get the HTML, and parse the data with BeautifulSoup; then I want to open the links obtained from parsing using Playwright.

Correct me if I am wrong.
First, use Python with BeautifulSoup to get the results, then use those results with Playwright.
So do we have to create and build two different Actors for this?
Hi @Abdul

The discussion above concerns crawlee-python.

In an Actor, you can combine an HTTP client + BeautifulSoup with Playwright, either within a single Actor or by using a bundle of two Actors.
Thanks for clarifying. Do you have anything that would help me get started on an Actor with an HTTP client + BeautifulSoup + Playwright?
No, I don't have any code samples like that, since I don't usually use Playwright or browser automation.

But writing such an Actor is not much different from writing a scraper that uses this combination.

Refer to the official documentation to see how Playwright is instantiated in an Actor: https://docs.apify.com/sdk/python/docs/guides/playwright

Adding an HTTP client + BeautifulSoup integration on top of that will not be a problem.
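
For anyone landing here later, the combination described in this thread could look roughly like the sketch below. It is not taken from the linked guide and is untested against any real site. It assumes `httpx` as the HTTP client, and the URL, selectors, the `query` input field, and the helper names (`absolute_unique`, `collect_product_urls`, `parse_product`) are all hypothetical placeholders. It also assumes the product pages are served as plain HTML that does not need JavaScript to render.

```python
import asyncio
from urllib.parse import urldefrag, urljoin

# Hypothetical placeholders: replace with the real site and selectors.
SEARCH_URL = "https://example.com/search"
SEARCH_INPUT = "input[name='q']"
RESULT_LINK = "a.product-link"


def absolute_unique(base_url: str, hrefs: list[str]) -> list[str]:
    """Resolve hrefs against base_url, drop fragments and duplicates, keep order."""
    seen: dict[str, None] = {}
    for href in hrefs:
        url, _fragment = urldefrag(urljoin(base_url, href))
        seen.setdefault(url, None)
    return list(seen)


async def collect_product_urls(query: str) -> list[str]:
    """Step 1: Playwright drives the JavaScript search form and collects links."""
    # Imported here so the pure helpers can be used without a browser installed.
    from playwright.async_api import async_playwright

    async with async_playwright() as pw:
        browser = await pw.chromium.launch(headless=True)
        page = await browser.new_page()
        await page.goto(SEARCH_URL)
        await page.fill(SEARCH_INPUT, query)
        await page.press(SEARCH_INPUT, "Enter")
        await page.wait_for_selector(RESULT_LINK)
        hrefs = await page.eval_on_selector_all(
            RESULT_LINK, "els => els.map(e => e.getAttribute('href'))"
        )
        base_url = page.url  # results page URL, used to resolve relative links
        await browser.close()
    return absolute_unique(base_url, [h for h in hrefs if h])


def parse_product(url: str, html: str) -> dict:
    """Step 2: BeautifulSoup parses a product page fetched over plain HTTP."""
    from bs4 import BeautifulSoup

    soup = BeautifulSoup(html, "html.parser")
    title = soup.select_one("h1")  # hypothetical selector for the product name
    return {"url": url, "title": title.get_text(strip=True) if title else None}


async def main() -> None:
    import httpx
    from apify import Actor

    async with Actor:
        actor_input = await Actor.get_input() or {}
        query = actor_input.get("query", "laptop")  # hypothetical input field
        urls = await collect_product_urls(query)
        # The browser is already closed; product pages use the cheap HTTP path.
        async with httpx.AsyncClient(follow_redirects=True, timeout=30) as client:
            for url in urls:
                response = await client.get(url)
                response.raise_for_status()
                await Actor.push_data(parse_product(url, response.text))
```

The Actor entry point would then call `asyncio.run(main())`. The two stages run one after the other inside a single Actor, so the slow browser is only used for the search form. If the product pages turn out to need JavaScript as well, this split will not help and they would have to go through Playwright too.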