Apify and Crawlee Official Forum

Updated 3 months ago

Loading files along with HTML-scraped content via LangChain's ApifyDatasetLoader

The ApifyDatasetLoader for LangChain loads the records, which include the text, metadata, and fileUrl fields. All of the examples show loading content via the text or metadata fields, but what about fileUrl? Assuming the run has records for PDF, XLSX, and/or other files, is there an example of how to load those files alongside the scraped HTML content?
2 comments
It's outside of SDK functionality: https://llamahub.ai/l/apify-dataset. Check their Git repo or post the question there, I guess.
Got it, thanks, will check via the integration repo.
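For anyone landing here later, here is a minimal sketch of one possible approach, not an official answer: separate the dataset records into scraped-text entries and file URLs, then handle the files with whatever downloader or parser you prefer. The helper name `split_records` and the sample records are hypothetical; only the `text`, `metadata`, and `fileUrl` field names come from the question above. The sketch uses plain dicts so it runs without LangChain installed.

```python
from typing import Any, Dict, Iterable, List, Tuple


def split_records(
    items: Iterable[Dict[str, Any]],
) -> Tuple[List[Dict[str, Any]], List[str]]:
    """Separate scraped-text records from records that point at a file.

    Hypothetical helper: assumes each dataset item may carry the
    `text`, `metadata`, and `fileUrl` fields mentioned in the question.
    """
    text_docs: List[Dict[str, Any]] = []
    file_urls: List[str] = []
    for item in items:
        file_url = item.get("fileUrl")
        if file_url:
            # File record: keep the URL so the file can be fetched and parsed separately.
            file_urls.append(file_url)
        text = item.get("text")
        if text:
            # Text record: keep the content and its metadata together.
            text_docs.append(
                {"page_content": text, "metadata": item.get("metadata") or {}}
            )
    return text_docs, file_urls


# Made-up sample records for illustration only.
sample = [
    {"text": "Hello from a scraped page", "metadata": {"title": "Home"}},
    {"fileUrl": "https://example.com/report.pdf", "metadata": {"title": "Report"}},
    {"text": "", "metadata": None},
]

text_docs, file_urls = split_records(sample)
print(len(text_docs), "text records;", len(file_urls), "file URLs")
```

The text entries could then be turned into documents in the loader's mapping function, while the collected URLs would need a separate download and parsing step for PDF, XLSX, and so on.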