Apify Discord Mirror

Updated 6 months ago

Keyvaluestore file extensions

At a glance

A community member wants files saved to a key-value store to get the .mhtml extension, but the code they provided always produces a .bin extension instead. Setting the content type to multipart/related did not help. After the member shares a sample snippet, the discussion concludes that the likely cause is that the mime-types and content-type packages do not support the .mht/.mhtml file extensions, so a fix would require adding them to those packages. There is no explicitly marked answer.

Hi,

How do I configure the key-value store to use the .mhtml file extension? The code below always seems to produce a .bin extension.

Plain Text
await KeyValueStore.setValue("some-name", data, {
  contentType: "application/x-mimearchive", // "multipart/related" doesn't work either
});

// data above is either text or buffer


The goal is to have a file written to KV similar to some-name.mhtml

Thanks!
14 comments
Hi, can you please assist me with this? 🙂
have you tried using multipart/related as mimetype?
Yes I did, it saved the file as some-name.bin which is really weird
oh sorry missed the comment above
could you send the whole minimum repro?
(I mean the page url, how you grab the content, etc)
Plain Text
// Inside the requestHandler: capture the page as an MHTML snapshot over CDP
const cdpSession = await page.target().createCDPSession();
await cdpSession.send('Page.enable');
const { data } = await cdpSession.send('Page.captureSnapshot', { format: 'mhtml' });

// Random hex key that already carries the desired .mhtml suffix
const key = `file-${Math.floor(Math.random() * 0xffffff).toString(16)}.mhtml`;
await KeyValueStore.setValue(key, data, {
  contentType: 'multipart/related',
});
URL: http://example.com/
I'd say there is nothing you can do about it right away. If I remember correctly, Crawlee uses the mime-types and content-type packages, and .mht/.mhtml do not return any matching mime type, so when you provide this mime type it falls back to .bin. I'll pass it on to the team, but I would not expect it to be changed or updated straight away.
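To make the fallback described above concrete, here is a minimal sketch of how a content-type-to-extension lookup can end up at .bin. It is an illustration only and not Crawlee's actual code: the KNOWN_EXTENSIONS table, extensionFor and fileNameFor are hypothetical names standing in for whatever the mime-types database and the storage layer really do.

```javascript
// Hypothetical illustration only: this is NOT Crawlee's actual source.
// A tiny stand-in for a MIME database that, like the one described above,
// has no entry for the MHTML content types.
const KNOWN_EXTENSIONS = new Map([
  ['text/html', 'html'],
  ['application/json', 'json'],
  ['image/png', 'png'],
]);

// Assumed generic fallback when no extension is known for a content type.
const FALLBACK_EXTENSION = 'bin';

function extensionFor(contentType) {
  // Strip parameters such as "; charset=utf-8" and normalize case.
  const mediaType = contentType.split(';')[0].trim().toLowerCase();
  return KNOWN_EXTENSIONS.get(mediaType) ?? FALLBACK_EXTENSION;
}

function fileNameFor(key, contentType) {
  return `${key}.${extensionFor(contentType)}`;
}

console.log(fileNameFor('some-name', 'text/html; charset=utf-8'));  // some-name.html
console.log(fileNameFor('some-name', 'multipart/related'));         // some-name.bin
console.log(fileNameFor('some-name', 'application/x-mimearchive')); // some-name.bin
```

Under these assumptions, any content type missing from the table gets .bin no matter what the key looks like, which matches the behavior reported in this thread.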
Thanks for your help! 😄
I just tried running it through the packages above, and indeed they do not return any mime type. So it basically needs to be added to the packages themselves... But if someone from the team comes up with a solution, I will write back.
Alright, thanks!