Crawlee doesn't seem to respect resource limits imposed by cgroups. This is a problem in containerised environments, where Crawlee either gets OOM-killed or silently slows to a crawl because it thinks it has far more resources available than it actually does. Reading the maximum RAM is easy enough:
```ts
import { existsSync, readFileSync } from 'node:fs';
import { log } from 'crawlee';

// Reads the cgroup v2 memory limit and returns it in MB, or null if no limit is set.
function getMaxMemoryMB(): number | null {
    const cgroupPath = '/sys/fs/cgroup/memory.max';
    if (!existsSync(cgroupPath)) {
        log.warning('Cgroup v2 memory limit file not found.');
        return null;
    }
    try {
        const data = readFileSync(cgroupPath, 'utf-8').trim();
        if (data === 'max') {
            log.warning('No memory limit set (cgroup reports "max").');
            return null;
        }
        const maxMemoryBytes = parseInt(data, 10);
        if (Number.isNaN(maxMemoryBytes)) {
            log.warning(`Unexpected value in ${cgroupPath}: "${data}"`);
            return null;
        }
        return maxMemoryBytes / (1024 * 1024); // Convert bytes to MB
    } catch (error) {
        log.exception(error as Error, 'Error reading cgroup memory limit:');
        return null;
    }
}
```
This can then be used to set a reasonable RAM limit for Crawlee. The CPU limits, however, are proving more difficult. Has anyone found a fix yet?
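For reference, this is roughly how I'm feeding the memory value in. It's a sketch that assumes the `memoryMbytes` option on Crawlee's `Configuration` is the right knob and that setting it on the global config before constructing the crawler is enough for the autoscaled pool to pick it up:

```ts
import { Configuration } from 'crawlee';

const maxMemoryMB = getMaxMemoryMB();
if (maxMemoryMB !== null) {
    // Assumption: memoryMbytes is read when the crawler is constructed,
    // so set it on the global config first.
    Configuration.getGlobalConfig().set('memoryMbytes', Math.floor(maxMemoryMB));
}
```

For CPU, reading the cgroup v2 quota itself looks straightforward; the open question is how to get Crawlee's autoscaling to honour it. An untested sketch of the reading side, assuming the same cgroup v2 layout as above (`/sys/fs/cgroup/cpu.max` containing `"<quota> <period>"` or `"max <period>"`) and reusing the imports from the first snippet; `getMaxCpus` is just a name I made up:

```ts
// Effective number of CPUs allowed by the cgroup v2 quota,
// e.g. "200000 100000" in cpu.max means 2 CPUs. Returns null when unlimited.
function getMaxCpus(): number | null {
    const cgroupPath = '/sys/fs/cgroup/cpu.max';
    if (!existsSync(cgroupPath)) {
        log.warning('Cgroup v2 CPU limit file not found.');
        return null;
    }
    try {
        const [quota, period] = readFileSync(cgroupPath, 'utf-8').trim().split(/\s+/);
        if (quota === 'max') {
            log.warning('No CPU limit set (cgroup reports "max").');
            return null;
        }
        return parseInt(quota, 10) / parseInt(period, 10);
    } catch (error) {
        log.exception(error as Error, 'Error reading cgroup CPU limit:');
        return null;
    }
}
```

As far as I can tell there is no `Configuration` option equivalent to `memoryMbytes` for CPU, so the pool still judges CPU load against the host. Manually capping `maxConcurrency` on the crawler is the only workaround I can think of, but that isn't really the same thing.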