I have an S3 bucket containing 3 files, and I want to push the contents of these 3 files into an array. With my code, I get what seems to be a multi-dimensional array that contains duplicates of some of the files. To be clear, there are no duplicates in the actual S3 bucket. I only want one entry per file to be pushed into the array.

const AWS = require('aws-sdk'); // loads the AWS SDK for JavaScript
AWS.config.update({region:'region'});
const s3 = new AWS.S3();

let allKeys = []; // This array stores all the key names from the to_be_processed S3 bucket
    
const bucketName = 'bucket-name'

const bucketParamsl = {
  Bucket: bucketName, /* required */
  //Delimiter: '/',
  Prefix: 'to_be_processed/',
  StartAfter: 'to_be_processed/'
};
// lists the objects in an S3 bucket
s3.listObjectsV2(bucketParamsl, function(err, data) {
  if (err) {
    console.log(err, err.stack); // an error occurred
  } else {
    console.log(data);
    let contents = data.Contents;
    contents.forEach(function (content) {
      allKeys.push(content.Key);
      console.log(allKeys)
    })
  }
})

Image of response when code is run

Edit: Inside the listObjectsV2 callback, I added this code:

for (let x = 0; x < allKeys.length; x++) {
  const bucketParams = {
    Bucket: bucketName,
    //Delimiter: '/',
    Key: allKeys[x].toString(),
  };

  // Writes the object from the bucket to the array result.
  const stream = s3.getObject(bucketParams)
    .createReadStream()
    .pipe(csv({}))
    .on('data', (data) => result.push(data))
    .on('end', () => {
      for (let i = 0; i < result.length; i++) {
        console.log(result[i].Email)
      }
    })
}

In this part of the program, the contents of the S3 bucket, which contains multiple CSV files, are pushed into an array, and the program then logs every email in that array to the console. The problem is that when there is more than one CSV file in the bucket, duplicate entries of some emails end up in the array. Please note that each CSV file is unique and none of them contains duplicate emails. How can I make sure there are no duplicates?

Image showing duplicate emails pushed in array.

1 Answer

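Move console.log(allKeys) out of the forEach loop. The array is not multi-dimensional: it is logged once per push, so the console shows the same array at each stage as it grows.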
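
A minimal sketch of that change, assuming the same s3 client and bucketParamsl as in the question. listKeys is a hypothetical helper name, not part of the SDK; it takes the client as an argument and hands back the flat array of keys once the whole response has been read:

```javascript
// Sketch: collect every key from one listObjectsV2 response, then report once.
// `listKeys` is a hypothetical helper name; `s3` is any object exposing
// listObjectsV2(params, callback), such as `new AWS.S3()` from the question.
function listKeys(s3, params, done) {
  s3.listObjectsV2(params, function (err, data) {
    if (err) {
      done(err);
      return;
    }
    // map() builds the flat array of key names in one step.
    const keys = data.Contents.map(function (content) {
      return content.Key;
    });
    done(null, keys);
  });
}

// Usage with the client and parameters from the question:
// listKeys(s3, bucketParamsl, function (err, keys) {
//   if (err) { console.log(err, err.stack); return; }
//   console.log(keys); // logged once, after all keys are collected
// });
```

Note that listObjectsV2 returns at most 1,000 keys per call, so a larger prefix would need the continuation token. That is not a concern for 3 files.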

1 Comment

lol easy fix, thank you. Could you take a look at my edit please? @Tomasz
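
On the edit: one likely cause, assuming result is declared once outside the for loop (the declaration is not shown), is that every stream pushes into the same array and every 'end' handler then logs the whole array, so rows from a file that finished earlier are printed again. Below is a minimal sketch of one way around that, giving each file its own array. collectRows is a hypothetical helper, not something from the question or the SDK, and csv is assumed to be the csv-parser module:

```javascript
// Sketch: gather the rows of ONE parsed stream into an array owned by that stream.
// `collectRows` is a hypothetical helper; `rowStream` is any object-mode stream
// that emits one 'data' event per row, e.g. the csv pipeline in the question.
function collectRows(rowStream) {
  return new Promise(function (resolve, reject) {
    const rows = []; // local to this call, so files never share rows
    rowStream
      .on('data', function (row) { rows.push(row); })
      .on('end', function () { resolve(rows); })
      .on('error', reject);
  });
}

// Usage with names from the question, inside an async function:
// for (const key of allKeys) {
//   const rows = await collectRows(
//     s3.getObject({ Bucket: bucketName, Key: key }).createReadStream().pipe(csv({}))
//   );
//   rows.forEach(function (row) { console.log(row.Email); });
// }
```

Each file's emails are then logged exactly once, when that file has been fully read. Errors raised by the S3 read stream itself do not travel through pipe(), so they would need their own 'error' handler.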
