Apify Discord Mirror

Xeno

Chrome Path

Hello, can you let me know what the path for Chrome is?
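If this is asking about the browser Puppeteer uses inside an Apify image, a minimal sketch for printing it (standard Puppeteer API; whether you want the bundled browser or a system install is an assumption):

Plain Text
// Prints the path of the Chrome/Chromium binary Puppeteer will launch.
const puppeteer = require('puppeteer');
console.log(puppeteer.executablePath());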
4 comments
React JS

WARNING in ./node_modules/apify-client/dist/resource_clients/user.js
Module Warning (from ./node_modules/source-map-loader/dist/cjs.js):
Failed to parse source map from '/Users/apple/Desktop/insta_downloader/insta_downloader/node_modules/apify-client/src/resource_clients/user.ts' file: Error: ENOENT: no such file or directory, open '/Users/apple/Desktop/insta_downloader/insta_downloader/node_modules/apify-client/src/resource_clients/user.ts'

WARNING in ./node_modules/apify-client/dist/resource_clients/webhook.js
Module Warning (from ./node_modules/source-map-loader/dist/cjs.js):
Failed to parse source map from '/Users/apple/Desktop/insta_downloader/insta_downloader/node_modules/apify-client/src/resource_clients/webhook.ts' file: Error: ENOENT: no such file or directory, open '/Users/apple/Desktop/insta_downloader/insta_downloader/node_modules/apify-client/src/resource_clients/webhook.ts'

WARNING in ./node_modules/apify-client/dist/resource_clients/webhook_collection.js
Module Warning (from ./node_modules/source-map-loader/dist/cjs.js):
Failed to parse source map from '/Users/apple/Desktop/insta_downloader/insta_downloader/node_modules/apify-client/src/resource_clients/webhook_collection.ts' file: Error: ENOENT: no such file or directory, open '/Users/apple/Desktop/insta_downloader/insta_downloader/node_modules/apify-client/src/resource_clients/webhook_collection.ts'
During my Apify scraping runs with Crawlee/Puppeteer (32 GB RAM per run), my jobs stop with the message: "There was an uncaught exception during the run of the Actor and it was not handled."
And you can see the logs in the screenshot at the end.
This often happens on runs lasting 30+ minutes; runs under 30 minutes are less likely to hit this error.
I've tried increasing the "protocolTimeout" setting, but the error still happens, just after a longer wait.
I've tried different concurrency settings, even leaving them at the default, but I consistently see this error.

Plain Text
const crawler = new PuppeteerCrawler({
    launchContext: {
        launchOptions: {
            headless: true,
            args: [
                "--no-sandbox", // Mitigates the "sandboxed" process issue in Docker containers,
                "--ignore-certificate-errors",
                "--disable-dev-shm-usage",
                "--disable-infobars",
                "--disable-extensions",
                "--disable-setuid-sandbox",
                "--ignore-certificate-errors",
                "--disable-gpu", // Mitigates the "crashing GPU process" issue in Docker containers
            ],
        },
    },
    maxRequestRetries: 1,
    navigationTimeoutSecs: 60,
    autoscaledPoolOptions: { minConcurrency: 30 },
    maxSessionRotations: 5,
    preNavigationHooks: [
        async ({ blockRequests }, goToOptions) => {
            if (goToOptions) goToOptions.waitUntil = "domcontentloaded"; // Set waitUntil here
            await blockRequests({
                urlPatterns: [
...
                ],
            });
        },
    ],
    proxyConfiguration,
    requestHandler: router,
});
await crawler.run(startUrls);
await Actor.exit();
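For reference, `protocolTimeout` is a Puppeteer launch option, so it belongs inside `launchOptions`; a minimal sketch with illustrative values (the lower concurrency is an assumption to reduce per-page memory pressure on long runs):

Plain Text
const crawler = new PuppeteerCrawler({
    launchContext: {
        launchOptions: {
            headless: true,
            protocolTimeout: 300_000, // ms; Puppeteer's default is 180 s
        },
    },
    // Illustrative: fewer concurrent pages -> less memory pressure per run.
    autoscaledPoolOptions: { minConcurrency: 5, maxConcurrency: 15 },
    // ...rest of the options as above
});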
1 comment
Using PPE Actors that we developed ourselves causes us to appear as paying users on the analytics dashboard. However, using our own PPR and rented Actors does not reflect us as paying users. This behavior with PPE Actors can be confusing for developers, and since there is no actual profit/cost change, it may look as if the Actor has issues with charging.

Additionally, having more detailed indicators for PPE actors in the analytics dashboard would be very beneficial. For example, it would be great to see how much each event is charged per execution for each actor.
Hi, we are trying to upgrade to a paid plan and we can't get the payment through. We checked the billing details and contacted the card company, and there were no issues on their end. They said there was no payment attempt from Apify. Can you please assist with this issue?
14 comments
I am running a Twitter Scraper actor v2 on Apify, and I see that my run succeeded and says 100 results,
but when I go to the details page, it is just an array of 100 items of {'demo': true}.
How can I get the proper details?
1 comment
❗ Guys, was something recently released or changed at Apify related to Actor resources, etc.? I have an Actor that has been running fine for a while, but in the past few days migrations have become frequent, causing issues for some of my paid Actor users. ⚠️
1 comment
A parameter name containing a dot (.) with the stringList editor doesn't work in the web console.

Example INPUT_SCHEMA.JSON
Plain Text
{
    "title": "Test",
    "type": "object",
    "schemaVersion": 1,
    "properties": {
        "search.location": { "title": "Locations #1", "type": "array", "description": "", "editor": "stringList", "prefill": ["Bandung"] },  ### <-- Problem
        "search_location": { "title": "Locations #2", "type": "array", "description": "", "editor": "stringList", "prefill": ["Bandung"] }
    }
}

check Actor-ID: acfF0psV9y4e9Z4hq
I can't click the +Add button. When I edit the field using the Bulk button, the resulting JSON is weird: it automatically becomes an object structure, which is a nice effect. I'm not sure if this is really a bug or a new feature?
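A sketch of the obvious workaround while this is open, assuming nothing external depends on the dotted name: keep the underscore variant in the schema and translate it where the Actor reads its input:

Plain Text
const input = await Actor.getInput();
// Hypothetical mapping: expose "search_location" in the schema,
// keep using the dotted name internally.
const search = { 'search.location': input.search_location };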
2 comments
I want an Apify Actor that takes a location name as input and returns the LinkedIn geolocation ID as output. Is there any such Actor available in the Apify Store, or on any platform in general?
2 comments
input_schema.json
'''
{
    "title": "Base64 Image Processor",
    "type": "object",
    "schemaVersion": 1,
    "properties": {
        "files": {
            "type": "array",
            "description": "Array of file objects to process",
            "items": {
                "type": "object",
                "properties": {
                    "file": {
                        "type": "object",
                        "properties": {
                            "name": {"type": "string"},
                            "type": {"type": "string"},
                            "size": {"type": "integer"},
                            "content": {"type": "string"},
                            "description": {"type": "string"}
                        },
                        "required": ["name", "type", "size", "content", "description"]
                    }
                },
                "required": ["file"]
            }
        }
    },
    "required": ["files"]
}
'''


Running and starting the Actor show this error:
2025-03-16T08:19:28.275Z ACTOR: ERROR: Input schema is not valid (Field schema.properties.files.enum is required)

I need help with this.
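For what it's worth, a guess at the cause: Apify input schemas require an `editor` for array properties and don't accept arbitrary JSON-Schema-style `items`/`properties` nesting, which may be what trips the validator here. A minimal variant that should pass validation, assuming per-file validation can move into the Actor code:

Plain Text
{
    "title": "Base64 Image Processor",
    "type": "object",
    "schemaVersion": 1,
    "properties": {
        "files": {
            "title": "Files",
            "type": "array",
            "description": "Array of file objects to process",
            "editor": "json"
        }
    },
    "required": ["files"]
}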
1 comment
I am trying to scrape a particular website,
but it seems to have Cloudflare or some other advanced firewall preventing any bot or automated script.

Please guide me toward a strategy that will work against such advanced protections.
1 comment
Hey everyone! 👋

I'm running into some trouble getting VNC to connect to my Docker container. I'm using apify/actor-node-playwright-chrome and running it as-is, but no luck in headful mode. The chrome_test.js and main.js scripts run perfectly, but VNC and Remote Debugging are not working.

I'm on Windows 11, using VS Code, WSL2, and Docker Desktop. I tried pulling the image from the Docker repo, and then I built the image on an Ubuntu distro via WSL2 and Docker Desktop with WSL integration enabled.

Here’s what I’ve tried so far:

Modified chrome_test.js to add a delay when headless: false
Exposed the necessary ports
Removed -nolisten tcp from both VNC servers
Still can't connect via VNC (RealVNC) or Chrome Remote Debugging
Is the image missing something like a VNC server?
Does xvfb/xvfb-run serve as the VNC server? It's usually used with a VNC server like x11vnc.
I also exposed the Chrome Remote Debugging port, without success in establishing a connection.

Not sure what I’m missing. Trying to set up Docker properly before diving into actor development. Anyone run into this before? Would appreciate any tips! 🙏
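On the Xvfb question: Xvfb only renders X11 into a virtual framebuffer; it does not listen for VNC clients, so a separate VNC server such as x11vnc has to attach to the display. A sketch, assuming the base image doesn't ship one (display :99 matches the Xvfb command these images log at startup):

Plain Text
FROM apify/actor-node-playwright-chrome
USER root
# x11vnc attaches to an existing X display and serves it over VNC.
RUN apt-get update && apt-get install -y x11vnc && rm -rf /var/lib/apt/lists/*
USER myuser
EXPOSE 5900
# At runtime, alongside Xvfb: x11vnc -display :99 -forever -nopw &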
2 comments
While running my code in the Apify IDE, I'm getting this error.

The build was successful.
3 comments
Hi! This is my url:
https://api.apify.com/v2/acts/crypto-scraper~dexscreener-tokens-scraper/run-sync-get-dataset-items?token=<my-token>
Body:
{
    "chainName": "solana",
    "filterArgs": [
        "?rankBy=trendingScoreH24&order=desc",
        "?rankBy=marketCap&order=desc&limit=10&minMarketCap=1"
    ],
    "fromPage": 1,
    "toPage": 1
}
I want to limit the fetched data to 100 items or fewer.

I changed my URL to: https://api.apify.com/v2/acts/crypto-scraper~dexscreener-tokens-scraper/run-sync-get-dataset-items?token=<my-token>&limit=100

But it still returns more than 100.

Has anyone experienced this? What am I doing wrong? Thanks in advance!
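A sketch of the distinction that may be at play (an assumption based on how the dataset endpoints behave): `limit` caps how many items the HTTP response returns, while the amount actually scraped is governed by the Actor input, so capping the run itself means tightening the input (e.g. fewer pages):

Plain Text
# limit=100 trims the response; fromPage/toPage (Actor input) control
# how much the run scrapes in the first place.
curl -X POST \
  "https://api.apify.com/v2/acts/crypto-scraper~dexscreener-tokens-scraper/run-sync-get-dataset-items?token=<my-token>&limit=100" \
  -H "Content-Type: application/json" \
  -d '{"chainName": "solana", "fromPage": 1, "toPage": 1}'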
2 comments
I have this in my .actor/pay_per_event.json and I call it in my main.py, but I get this warning in my terminal: 2025-03-08T14:09:14.994Z [apify] WARN Ignored attempt to charge for an event - the Actor does not use the pay-per-event pricing
If I use await Actor.charge('actor-start-gb '), will it be correctly using PPE? Please let me know, thank you in advance.


{ "actor-start": { "eventTitle": "Price for Actor start", "eventDescription": "Flat fee for starting an Actor run.", "eventPriceUsd": 0.1 }, "task-completed": { "eventTitle": "Price for completing the task", "eventDescription": "Flat fee for completing the task.", "eventPriceUsd": 0.4 } }

main.py
async def main():
    """Runs the AI Travel Planner workflow."""
    async with Actor:
        await Actor.charge('actor-start')
        actor_input = await Actor.get_input() or {}
        Actor.log.info(f"Received input: {actor_input}")
        travel_query = TravelState(**actor_input)
        # Execute workflow
        final_state = travel_workflow.invoke(travel_query)
        Actor.log.info(f"Workflow completed. Final state: {final_state}")
        await Actor.charge('task-completed')
        # Save the final report
        await save_report(final_state)
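One hedged observation, going only by the warning's wording: Actor.charge() is ignored unless the Actor's pricing model is actually set to pay-per-event in the Apify Console; a pay_per_event.json file in the repo alone doesn't switch the model. Also, the event name passed to charge() has to match a configured event exactly, so 'actor-start-gb ' (with the trailing space) would not match 'actor-start'.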
7 comments
Getting an error for a basic crawler when passing in my starting arguments.

It says the input must contain "url", which it already does.

Plain Text
2025-03-07T21:22:12.478Z ACTOR: Pulling Docker image of build aJ5w2MnrBdaZRxGeA from repository.
2025-03-07T21:22:13.611Z ACTOR: Creating Docker container.
2025-03-07T21:22:13.835Z ACTOR: Starting Docker container.
2025-03-07T21:22:14.208Z Starting X virtual framebuffer using: Xvfb :99 -ac -screen 0 1920x1080x24+32 -nolisten tcp
2025-03-07T21:22:14.210Z Executing main command
2025-03-07T21:22:15.368Z INFO  System info {"apifyVersion":"3.3.2","apifyClientVersion":"2.12.0","crawleeVersion":"3.13.0","osType":"Linux","nodeVersion":"v20.18.3"}
2025-03-07T21:22:15.498Z INFO  Starting the crawl process {"startUrls":[{"url":"https://salesblaster.ai"}],"maxRequestsPerCrawl":100,"datasetName":"default"}
2025-03-07T21:22:15.905Z ERROR Error running scraper: {"error":"Request options are not valid, provide either a URL or an object with 'url' property (but without 'id' property), or an object with 'requestsFromUrl' property. Input: {\n  url: { url: 'https://salesblaster.ai' },\n  userData: {\n    datasetName: 'default',\n    initialUrl: { url: 'https://salesblaster.ai' }\n  }\n}"}
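The log itself points at the cause: `url` received the whole `{ url: ... }` object rather than the string. A minimal sketch of the unwrapping, assuming `startUrls` comes from the Actor input in the usual `[{ url: '...' }]` shape:

Plain Text
const { startUrls, datasetName } = (await Actor.getInput()) ?? {};
await crawler.run(startUrls.map((item) => ({
    url: item.url, // pass the string, not the whole object
    userData: { datasetName, initialUrl: item.url },
})));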
1 comment
Hey everyone,

I have built an Instagram Scraper using Selenium and Chrome that works perfectly until I deploy it as an actor here on Apify.

It signs in fine, but it fails every time, no matter what I do or try, when it gets to the Search button.

I have iterated through:

1) search_icon = WebDriverWait(driver, 20).until(
       EC.element_to_be_clickable((By.CSS_SELECTOR, "svg[aria-label='Search']"))
   )
   search_icon.click()

-----

2) search_icon = WebDriverWait(driver, 20).until(
       EC.element_to_be_clickable((By.XPATH, "//span[contains(., 'Search')]"))
   )
   search_icon.click()

-----

3) search_icon = WebDriverWait(driver, 20).until(
       EC.element_to_be_clickable((By.XPATH, "//svg[@aria-label='Search']"))
   )
   search_icon.click()

----

4) try:
       search_button = WebDriverWait(driver, 30).until(
           EC.element_to_be_clickable((
               By.XPATH,
               "//a[.//svg[@aria-label='Search'] and .//span[normalize-space()='Search']]"
           ))
       )
       # Scroll the element into view just in case
       driver.execute_script("arguments[0].scrollIntoView(true);", search_button)
       search_button.click()
   except TimeoutException:
       print("Search button not clickable.")

----

5) search_button = WebDriverWait(driver, 30).until(
       EC.element_to_be_clickable((
           By.XPATH,
           "//a[.//svg[@aria-label='Search'] and .//span[normalize-space()='Search']]"
       ))
   )
   driver.execute_script("arguments[0].scrollIntoView(true);", search_button)
   search_button.click()


And I have tried all of these with residential proxies, datacenter proxies, and different timeout lengths. NOTHING works, and there is nothing I can find in the documentation to help with this issue.

Does anyone have any insight into this??

I'd understand if this was failing to even sign in, but it is failing at the Search button. Is the page rendered differently for Apify than it is if you're running this from your own computer, maybe?
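One testable hypothesis for that last question: Instagram serves different markup depending on viewport and user agent, so the desktop "Search" link may simply not be in the DOM inside the container. A sketch that pins both to desktop values before signing in (values are illustrative):

Plain Text
from selenium import webdriver

options = webdriver.ChromeOptions()
options.add_argument("--window-size=1920,1080")  # desktop-sized viewport
options.add_argument(
    "user-agent=Mozilla/5.0 (Windows NT 10.0; Win64; x64) "
    "AppleWebKit/537.36 (KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36"
)
driver = webdriver.Chrome(options=options)
# Dumping the HTML right before the failing wait shows which layout rendered:
# print(driver.page_source)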
2 comments
I have a public Actor, and some of my users find that the default and/or named datasets don't seem to exist and somehow won't be created when pushing data to them.
This is the error message I can see, affecting only a handful of user runs:

Plain Text
ERROR PlaywrightCrawler: Request failed and reached maximum retries. ApifyApiError: Dataset was not found
2025-03-06T17:37:21.112Z   clientMethod: DatasetClient.pushItems
2025-03-06T17:37:21.113Z   statusCode: 404
2025-03-06T17:37:21.115Z   type: record-not-found
2025-03-06T17:37:21.119Z   httpMethod: post
2025-03-06T17:37:21.120Z   path: /v2/datasets/<redacted>/items
2025-03-06T17:37:21.122Z   stack:
2025-03-06T17:37:21.124Z     at makeRequest (/home/myuser/node_modules/apify-client/dist/http_client.js:187:30)
2025-03-06T17:37:21.125Z     at process.processTicksAndRejections (node:internal/process/task_queues:95:5)
2025-03-06T17:37:21.127Z     at async DatasetClient.pushItems (/home/myuser/node_modules/apify-client/dist/resource_clients/dataset.js:104:9)
2025-03-06T17:37:21.129Z     at async processSingleReviewDetails (file:///home/myuser/dist/helperfunctions.js:365:5)
2025-03-06T17:37:21.131Z     at async Module.processReviews (file:///home/myuser/dist/helperfunctions.js:379:13)
2025-03-06T17:37:21.133Z     at async getReviews (file:///home/myuser/dist/main.js:37:5)
2025-03-06T17:37:21.135Z     at async PlaywrightCrawler.requestHandler [as userProvidedRequestHandler] (file:///home/myuser/dist/main.js:98:13)
2025-03-06T17:37:21.137Z     at async wrap (/home/myuser/node_modules/@apify/timeout/cjs/index.cjs:54:21)
2025-03-06T17:37:21.139Z   data: undefined {"id":"<redacted>","url":"<redacted>?sort=recency&languages=all","method":"GET","uniqueKey":"https://www.trustpilot.com/review/<redacted>?languages=all&sort=recency"}

How can I ensure that the datasets are created ahead of time, before the scraper starts collecting data, so it doesn't fail because the dataset can't be created or does not exist?
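A minimal sketch of the usual pattern, assuming the Apify SDK v3 API: Actor.openDataset() creates the dataset if it doesn't exist, so opening it once at startup (before any pushes) guarantees it's there:

Plain Text
import { Actor } from 'apify';

await Actor.init();
// Opens (and creates, if missing) the dataset before the crawl starts.
const dataset = await Actor.openDataset('my-named-dataset'); // name is illustrative
// ...later, in handlers:
await dataset.pushData({ some: 'item' });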
6 comments
Hey everyone! 👋

I'm using a public Apify Actor in a Make.com scenario and want to pass a custom tag ({{custom_tag}}) in the input JSON when running the Actor. The goal is to have this tag included in the webhook payload that Apify sends back when the run completes.

I can see that when I start the Apify Actor, the {{custom_tag}} is included in the input, but I don't know how to get the Actor to output this {{custom_tag}} in its payload. I was thinking of using a custom webhook payload template to manually add {{custom_tag}}, but I cannot find any solution for this.

Has anyone successfully done this before? Would love to hear how you approached it! 🚀
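One hedged approach, relying on the run ID that the default webhook payload does carry: fetch the run's INPUT record afterwards and read the tag back out of it:

Plain Text
# The webhook payload includes the run ID; two follow-up calls recover
# the original input (and therefore custom_tag):
GET https://api.apify.com/v2/actor-runs/<runId>?token=<token>
#   -> response.data.defaultKeyValueStoreId
GET https://api.apify.com/v2/key-value-stores/<storeId>/records/INPUT?token=<token>
#   -> the input JSON, including custom_tag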
2 comments
Hey all, I have a question about whether I can actually use Apify to access Google Gemini for video analysis:

I've built my own Python version of the Gemini Video Analyzer applet that analyzes social media videos for content style, structure, and aesthetic qualities, and it works; I have installed all the Google dependencies required. But when I try to run it as an Actor using "apify run --purge", no matter what I do, it says no module named google was found.

Is this a bug with Apify?

There is no explicit "google" folder in Lib\site-packages, but when I check the file path, the package is there:

PS C:\Users\Ken\Apify\run> pip show google-generativeai
Name: google-generativeai
Version: 0.5.2
Summary: Google Generative AI High level API client library and tools.
Home-page: https://github.com/google/generative-ai-python
Author: Google LLC
Author-email: googleapis-packages@google.com
License: Apache 2.0
Location: C:\Users\Ken\AppData\Local\Programs\Python\Python313\Lib\site-packages
Requires: google-ai-generativelanguage, google-api-core, google-api-python-client, google-auth, protobuf, pydantic, tqdm, typing-extensions
Required-by:
PS C:\Users\Ken\Apify\run>

Has anyone else run into this issue?
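A guess based on the pip show output: that Location points at the global Python 3.13 site-packages, while apify run typically executes the Actor inside the project's .venv, which would explain the missing module. Installing into the venv is a quick test:

Plain Text
:: From the project folder (Windows); paths assume the default .venv layout.
.venv\Scripts\activate
pip install google-generativeai
apify run --purge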

I was excited to try to recreate the agent team that I already use on Apify, but I keep running into all these problems I haven't had anywhere else, and I'm starting to wonder if it's worth putting in the time to continue using Apify. Don't get me wrong, I think Apify is great for launching simple things like a YouTube scraper, but for things like deploying a 30-agent team as an app, I'm starting to wonder whether the learning curve is worth the time.
8 comments
Message => For better security, please use the "Sign up with Google" button to sign up with your Gmail account.

I don't want to use Google SSO; I'm stuck at the sign-up form.
1 comment
Hello,

In general, on Apify, is it possible to force the browser language? (for example: fr, es, or en)

I have tested with:

Plain Text
"proxy": {
    "useApifyProxy": true,
    "apifyProxyGroups": ["RESIDENTIAL"],
    "countryCode": "FR",
    "apifyProxyCountry": "FR"
  },
  "proxyConfiguration": {
    "useApifyProxy": true,
    "apifyProxyGroups": ["RESIDENTIAL"],
    "countryCode": "FR",
    "apifyProxyCountry": "FR"
  },
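The proxy country only changes the exit IP, not the browser's language. A sketch of two common ways to force the language in a Puppeteer-based Actor (the flag and the header are standard Chromium/Puppeteer; the placement assumes a Crawlee-style config):

Plain Text
launchContext: {
    launchOptions: {
        args: ['--lang=fr-FR'], // Chromium UI language + navigator.language
    },
},
preNavigationHooks: [
    async ({ page }) => {
        // What servers actually read for content negotiation:
        await page.setExtraHTTPHeaders({ 'Accept-Language': 'fr-FR,fr;q=0.9' });
    },
],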
4 comments
Hey all, I successfully deployed one Actor yesterday and followed all the same steps to deploy my next Actor, but now the Apify CLI cannot detect Python anymore when I run "apify run", which is crazy because it has to detect it in order to build the Actor in the first place.

This is the output in my terminal, which shows that the CLI can't detect Python even though I can find the version with no problem:

PS C:\Users\Ken\New PATH py\testing-it> apify run --purge
Info: All default local stores were purged.
Error: No Python detected! Please install Python 3.9 or higher to be able to run Python Actors locally.
PS C:\Users\Ken\New PATH py\testing-it> python --version
Python 3.13.2

Here's my pyvenv.cfg:

home = C:\Program Files\Python313
include-system-site-packages = false
version = 3.13.2
prompt = 'testing-it'
executable = C:\Program Files\Python313\python.exe
command = C:\Program Files\Python313\python.exe -m venv --prompt="." C:\Users\Ken\New PATH py\testing-it.venv

This is from my CMD:
C:\Users\Ken>where python
C:\Program Files\Python313\python.exe

I have added the correct PATHs to my environment variables, uninstalled and reinstalled Python, started new Apify builds, restarted my computer, and deactivated the Microsoft app execution aliases.

The best I can figure is that something is pointing apify run somewhere different than C:\Program Files\Python313\python.exe, because that is where Python is and where the Apify CLI has clearly found it before.

Has this ever happened to anyone else?
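One thing worth ruling out, purely a hypothesis: the project path contains spaces ("New PATH py"), which unquoted command construction can choke on even when python itself resolves fine. Moving the project to a space-free path is a quick test:

Plain Text
> move "C:\Users\Ken\New PATH py\testing-it" C:\Users\Ken\testing-it
> cd C:\Users\Ken\testing-it
> apify run --purge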
9 comments
Hi,
I am using the Website Content Crawler Actor (apify/website-content-crawler) to scrape a few thousand URLs. These are a pre-decided list of URLs, so the depth is set to 0. I saw a few of them fail. Is there any way to get access to these failed URLs from either the Apify site UI or the integration? The dataset under Storage only contains the successful URLs.
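A sketch of one way to get at them via the API, assuming the run's default request queue still holds the failed requests (method names per the apify-client JS package):

Plain Text
import { ApifyClient } from 'apify-client';

const client = new ApifyClient({ token: process.env.APIFY_TOKEN });
const run = await client.run('<runId>').get();
const { items } = await client
    .requestQueue(run.defaultRequestQueueId)
    .listRequests();
// Requests that errored carry their messages with them:
const failed = items.filter((r) => r.errorMessages?.length > 0);
console.log(failed.map((r) => r.url));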
4 comments
New to the tool, very cool. When I dropped the link I wanted to scrape into the Actor, it came out perfect. I'd like to build this into an automation via Zapier, though. Ideally, I submit a Google Form with the link to the Goodreads URL I want to scrape, Zapier fires a webhook to pull the info, and then adds it where I need it to go. I was using a tool for Amazon with a format like this and am wondering if there is a similar way to do it for this tool? https://api.scraperapi.com/?api_key={{api-key}}&url={{URL}}
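The closest Apify equivalent to that single-URL pattern is the run-sync-get-dataset-items endpoint (the same one used elsewhere in this list); the Actor name and input shape below are assumptions to adapt to whichever Goodreads scraper you're using:

Plain Text
POST https://api.apify.com/v2/acts/<goodreads-actor>/run-sync-get-dataset-items?token={{api-key}}
Content-Type: application/json

{"startUrls": [{"url": "{{URL}}"}]}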
1 comment