AWS S3
AWS S3 files are periodically copied to Google Cloud Storage. There are many ways to do the copy, but here we use the gsutil command from the Google Cloud SDK. For the Google Cloud SDK, we use its Alpine Linux Docker image. However, at some point, I st…
In Google Cloud Storage, I tried to get a list of the files in a subdirectory and process them. However, processing failed with an error saying a file was invalid, and when I looked into it, I found that the retrieved file list included a subdirectory. Reaso…
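A minimal sketch of one way to guard against this kind of listing, assuming the extra entry is a "folder" placeholder object whose name ends in `/` (Cloud Storage has a flat namespace, so a directory can show up in a listing as an ordinary object). The function name `only_files` and the sample object names are hypothetical; in real code the names would come from the listing call of whichever client library is in use.

```python
from typing import Iterable, List


def only_files(object_names: Iterable[str]) -> List[str]:
    """Drop entries that look like directory placeholders (names ending in '/')."""
    return [name for name in object_names if not name.endswith("/")]


# Hypothetical listing for the prefix "data/sub/": the first entry is the
# placeholder for the subdirectory itself, not a real file.
listing = ["data/sub/", "data/sub/a.csv", "data/sub/b.csv"]
files = only_files(listing)
print(files)
```

Filtering on the trailing slash is a convention, not a guarantee: an object can legitimately be given a name that ends in `/`, so this check may need adjusting for buckets that do so.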
We previously used the GCP Python library to get a list of Google Cloud Storage subdirectories; this is the Node.js version of that. The Python version is here: www.ekwbtblog.com. Procedure: The GCP Node.js SDK documentation is kind enough to…
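To illustrate what "listing subdirectories" means in a flat object namespace, here is a small local sketch, written in Python rather than Node.js and not taken from either post. It emulates the grouping a listing with a prefix and a delimiter produces: every object name that contains the delimiter after the prefix is collapsed into one subdirectory entry. The function name `list_subdirectories` and the sample object names are hypothetical.

```python
from typing import Iterable, List


def list_subdirectories(
    object_names: Iterable[str], prefix: str, delimiter: str = "/"
) -> List[str]:
    """Return the immediate subdirectories under prefix, sorted."""
    found = set()
    for name in object_names:
        if not name.startswith(prefix):
            continue
        rest = name[len(prefix):]
        head, sep, _ = rest.partition(delimiter)
        # A delimiter after the prefix means the object sits in a subdirectory.
        if sep:
            found.add(prefix + head + delimiter)
    return sorted(found)


# Hypothetical object names in a bucket.
names = [
    "data/2020/a.csv",
    "data/2020/b.csv",
    "data/2021/c.csv",
    "data/readme.txt",
    "logs/x.log",
]
subdirs = list_subdirectories(names, "data/")
print(subdirs)  # ['data/2020/', 'data/2021/']
```

Against a real bucket this grouping is normally done server-side by passing the prefix and delimiter to the listing request, which avoids downloading every object name.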
How to access AWS S3 from Spark (Google Dataproc). Procedure: Spark configuration. The following Spark and Hadoop settings allow you to read and write AWS S3 files from Spark. Load the following AWS-related jar files into Spark: aws-java…
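The excerpt cuts off before the actual settings, so the following is only a sketch of what such a configuration might look like, assuming the Hadoop S3A connector is used; the original post may use different properties. The helper name `s3a_spark_conf` and the placeholder credentials are hypothetical. The `spark.hadoop.` prefix is how Spark forwards a property to the underlying Hadoop configuration.

```python
from typing import Dict


def s3a_spark_conf(access_key: str, secret_key: str) -> Dict[str, str]:
    """Build Spark properties for S3 access (assumption: S3A connector)."""
    return {
        "spark.hadoop.fs.s3a.impl": "org.apache.hadoop.fs.s3a.S3AFileSystem",
        "spark.hadoop.fs.s3a.access.key": access_key,
        "spark.hadoop.fs.s3a.secret.key": secret_key,
    }


# Placeholder credentials, for illustration only.
conf = s3a_spark_conf("EXAMPLE_ACCESS_KEY", "EXAMPLE_SECRET_KEY")
for key in sorted(conf):
    print(key)  # print property names only, never the secret values
```

Each pair could then be applied with `SparkSession.builder.config(key, value)` or passed as `--conf key=value` when submitting the job, after which paths of the form `s3a://bucket/path` should become readable, provided the required jar files are on the classpath.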