Apify

Apify and Crawlee Official Forum


'undefined' in Dataset is keeping me from exporting data

When I run Dataset.getData(), I keep finding that one of the items is undefined.
This prevents me from calling Dataset.exportToCSV(), because it complains about the undefined value.
Is there a way to clean the Dataset so that the undefined value does not appear,
or to find out why this is happening?
10 comments
Is it in your own data items? Can you share the full error message?
Plain Text
/home/user/compras_publicas_scrapper/node_modules/csv-stringify/dist/cjs/sync.cjs:322
        return Error(`Invalid Record: expect an array or an object, got ${JSON.stringify(chunk)}`);
               ^

Error: Invalid Record: expect an array or an object, got undefined
    at Object.__transform (/home/user/compras_publicas_scrapper/node_modules/csv-stringify/dist/cjs/sync.cjs:322:16)
    at stringify (/home/user/compras_publicas_scrapper/node_modules/csv-stringify/dist/cjs/sync.cjs:553:21)
    at Dataset.exportTo (/home/telix/compras_publicas_scrapper/node_modules/@crawlee/core/storages/dataset.js:253:48)
    at async Dataset.exportToCSV (/home/user/compras_publicas_scrapper/node_modules/@crawlee/core/storages/dataset.js:286:9)
    at async Dataset.exportToCSV (/home/user/compras_publicas_scrapper/node_modules/@crawlee/core/storages/dataset.js:306:9)
    at async file:///home/user/compras_publicas_scrapper/src/main.js:29:1

Node.js v19.1.0
I poked around a little, and this happens when one of the items in the Dataset is undefined.
For example, the output of await Dataset.getData() looks something like this:
Plain Text
{
  count: 68,
  desc: false,
  items: [
    {
      'Descripción': [Object],
      Fechas: [Object],
      Productos: [Object],
      'Parámetros de Calificación': [Object],
      Archivos: [Array],
      url: 'https://www.compraspublicas.gob.ec/ProcesoContratacion/compras/PC/informacionProcesoContratacion2.cpe?idSoliCompra=GILqItCW52eDlBQxzLpaLlDdArmhJscPjPHoxQjiTfA,'
    },
    undefined,
    {
      'Descripción': [Object],
      Fechas: [Object],
      Productos: [Object],
      'Parámetros de Calificación': [Object],
      Archivos: [Array],
      url: 'https://www.compraspublicas.gob.ec/ProcesoContratacion/compras/PC/informacionProcesoContratacion2.cpe?idSoliCompra=vRmOB40vA8u58yblEZpsmW3AMHl8RaM3CHv6oQPAf5Y,'
    },
  ],
  limit: 999999999999,
  offset: 0,
  total: 68
}
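To see which positions in an items array like the one above hold something other than a record, a small helper along these lines might be useful. This is only a sketch: findInvalidItems and the sample data are hypothetical names made up for illustration, not part of Crawlee, and the sample URLs are placeholders.

```javascript
// Returns the positions of entries that are not records (objects or arrays),
// which is what the CSV export in the stack trace above expects.
function findInvalidItems(items) {
  const invalid = [];
  for (let index = 0; index < items.length; index++) {
    const item = items[index];
    const isRecord = item !== null && typeof item === 'object';
    if (!isRecord) invalid.push({ index, value: item });
  }
  return invalid;
}

// Hypothetical sample shaped like the getData() output: one undefined entry
// sitting between two normal items.
const sampleItems = [
  { url: 'https://example.com/a' },
  undefined,
  { url: 'https://example.com/b' },
];

console.log(findInvalidItems(sampleItems)); // [ { index: 1, value: undefined } ]
```

In a real script, the items array from await Dataset.getData() would be passed in instead of sampleItems. The reported index shows which neighbouring items (and their url fields) surround the bad entry.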
You can try getting the dataset items with the clean option (https://crawlee.dev/api/core/interface/DatasetDataOptions#clean), but the root cause of the error is that you have a corrupted item: undefined is not expected as an item value; it is supposed to be a JSON object.
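Until the root cause is found, one possible workaround is to drop the non-record entries yourself before exporting. The sketch below is not an official fix: keepValidRecords is a hypothetical helper, and the dataset name 'cleaned' and export key 'results' in the commented usage are made-up placeholders. Whether the clean option alone removes the undefined entry in local storage is not confirmed in this thread.

```javascript
// Keeps only entries that the CSV export can handle: objects or arrays.
// undefined, null, and primitive values are dropped.
function keepValidRecords(items) {
  return items.filter((item) => item !== null && typeof item === 'object');
}

// Possible wiring into a Crawlee script (shown as comments, not run here;
// 'cleaned' and 'results' are placeholder names):
//
//   import { Dataset } from 'crawlee';
//   const source = await Dataset.open();
//   const { items } = await source.getData();
//   const cleaned = await Dataset.open('cleaned');
//   await cleaned.pushData(keepValidRecords(items));
//   await cleaned.exportToCSV('results');

// Small stand-in for the real items array.
const example = [{ id: 1 }, undefined, { id: 2 }];
console.log(keepValidRecords(example)); // [ { id: 1 }, { id: 2 } ]
```

Copying the filtered items into a separate dataset leaves the original untouched, so the corrupted entry is still available for debugging.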
This is a bug in Crawlee, so it will be fixed.
Would you please send me the dataset ID privately? I would like to see how the undefined got there.