s3:ListBucket and s3:GetObject for that bucket.
s3:PutObject for that bucket.
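As a rough illustration of the permissions named above, the following sketch builds an IAM policy document in Python. The helper name build_s3_policy and the bucket name my-bucket are placeholders introduced for this example, and the read/write split is an assumption about how you might group the three actions. It relies on the fact that s3:ListBucket applies to the bucket itself, while s3:GetObject and s3:PutObject apply to the objects in it.

```python
import json


def build_s3_policy(bucket: str, read: bool = True, write: bool = False) -> dict:
    """Return an IAM policy document granting the selected S3 actions on one bucket.

    Hypothetical helper for illustration; adapt the statements to your own needs.
    """
    bucket_arn = f"arn:aws:s3:::{bucket}"
    statements = []
    if read:
        # s3:ListBucket is a bucket-level action, so it targets the bucket ARN.
        statements.append(
            {"Effect": "Allow", "Action": ["s3:ListBucket"], "Resource": bucket_arn}
        )
        # s3:GetObject is an object-level action, so it targets the bucket's objects.
        statements.append(
            {"Effect": "Allow", "Action": ["s3:GetObject"], "Resource": f"{bucket_arn}/*"}
        )
    if write:
        # s3:PutObject is also an object-level action.
        statements.append(
            {"Effect": "Allow", "Action": ["s3:PutObject"], "Resource": f"{bucket_arn}/*"}
        )
    return {"Version": "2012-10-17", "Statement": statements}


policy = build_s3_policy("my-bucket", read=True, write=True)
print(json.dumps(policy, indent=2))
```

The printed JSON can be pasted into an IAM policy editor after replacing my-bucket with your bucket name.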
protocol://bucket/ (for example, s3://my-bucket/).
If the target files are in a folder, the path to the target folder in the S3 bucket, formatted as protocol://bucket/path/to/folder/ (for example, s3://my-bucket/my-folder/).
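The two path formats above can be produced with a small helper. This is a minimal sketch; the function name s3_remote_url is hypothetical, and the default protocol of s3 is an assumption based on the examples.

```python
def s3_remote_url(bucket: str, folder: str = "", protocol: str = "s3") -> str:
    """Build protocol://bucket/ or protocol://bucket/path/to/folder/.

    Hypothetical helper for illustration only.
    """
    base = f"{protocol}://{bucket.strip('/')}/"
    folder = folder.strip("/")
    # Always end with a trailing slash, matching the formats shown above.
    return f"{base}{folder}/" if folder else base


print(s3_remote_url("my-bucket"))
print(s3_remote_url("my-bucket", "my-folder"))
```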
To change the following bucket policy to restrict it to a specific user in the AWS account, change root to that specific username.
In this policy, replace the following:
<my-account-id> with your AWS account ID.
<my-bucket-name> in two places with the name of your bucket.
create-s3-bucket.yaml. To change the following bucket policy to restrict it to a specific user in the AWS account, change root to that specific username.
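The policy itself is not reproduced here, so the following is only a sketch of what switching the principal from root to a specific user could look like. It assumes the policy's principal is an IAM ARN; the helper name policy_principal and the example values are hypothetical. Note that an IAM user ARN takes the form user/<username>, so the change may involve more than swapping the single word.

```python
from typing import Optional


def policy_principal(account_id: str, username: Optional[str] = None) -> str:
    """Return the principal ARN for a bucket policy.

    Hypothetical helper: with no username the whole account (root) is the
    principal; with a username only that IAM user is.
    """
    if username:
        # IAM user ARNs use the "user/<name>" resource form.
        return f"arn:aws:iam::{account_id}:user/{username}"
    return f"arn:aws:iam::{account_id}:root"


print(policy_principal("111122223333"))
print(policy_principal("111122223333", "my-user"))
```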
create-s3-bucket.sh. To change the following bucket policy to restrict it to a specific user in the AWS account, change root to that specific username.
In this script, replace the following:
<my-account-id> with your AWS account ID.
<my-unique-bucket-name> with the name of your bucket.
<us-east-1> with your AWS Region.
A Parquet (.parquet) file per file in the source location. For example, for a file in the source location named my-file.pdf, an associated file with the extension .parquet is generated. Various kinds of file transactions can result in additional Parquet files being generated. These Parquet filenames are automatically generated by the Delta Lake engine and are not meant to be manually modified.
A folder named _delta_log that contains metadata and change history about the .parquet files. As Parquet files are added to, changed, or removed from the specified bucket or folder path, the _delta_log folder is updated with any related metadata and change history details.
Together, the Parquet files and the _delta_log folder (and its contents) describe a single, versioned Delta table. Because of this, Unstructured recommends the following usage best practices:
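Do not manually modify the Parquet files or the _delta_log folder within a Delta table’s directory. This can lead to data loss or table corruption.
If you copy or move a Delta table, copy or move the Parquet files and the _delta_log folder (and its contents) together as a unit.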
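To make the "as a unit" guidance concrete, here is a minimal sketch that works on a plain list of object keys rather than on a live bucket. It summarizes a table's layout and plans a relocation that refuses to proceed when no _delta_log objects are present. The function names, the example keys, and the destination prefix are all hypothetical; the only Delta Lake detail relied on is that commit files in _delta_log are named with a 20-digit, zero-padded version number and a .json extension.

```python
import re

# Delta Lake commit files are named with a 20-digit, zero-padded version number.
_COMMIT = re.compile(r"(?:^|/)_delta_log/(\d{20})\.json$")


def _in_delta_log(key: str) -> bool:
    """True if the key lives inside a _delta_log folder."""
    return "/_delta_log/" in "/" + key


def summarize_table(keys):
    """Count data files and find the newest commit version among object keys."""
    data_files = [k for k in keys if k.endswith(".parquet") and not _in_delta_log(k)]
    versions = [int(m.group(1)) for k in keys if (m := _COMMIT.search(k))]
    return {"data_files": len(data_files), "latest_version": max(versions, default=None)}


def plan_relocation(keys, src_prefix, dst_prefix):
    """Map every key under src_prefix to dst_prefix, keeping the layout intact."""
    src = src_prefix.rstrip("/") + "/"
    dst = dst_prefix.rstrip("/") + "/"
    outside = [k for k in keys if not k.startswith(src)]
    if outside:
        raise ValueError(f"keys outside {src!r}: {outside}")
    if not any(_in_delta_log(k) for k in keys):
        # Without the log, the Parquet files alone do not describe a Delta table.
        raise ValueError("no _delta_log objects found; refusing a partial copy")
    return {k: dst + k[len(src):] for k in keys}


keys = [
    "my-folder/part-0001.parquet",
    f"my-folder/_delta_log/{0:020d}.json",
    f"my-folder/_delta_log/{1:020d}.json",
]
print(summarize_table(keys))
print(plan_relocation(keys, "my-folder", "archive/my-folder"))
```

The returned mapping could then drive whatever copy tool you use, so that data files and log entries always travel together.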
Note that the copied or moved Delta table will no longer be controlled by the original Delta Tables in S3 destination connector.
<name> (required) - A unique name for this connector.
<aws-region> (required) - The AWS Region identifier (for example, us-east-1) for the Amazon S3 bucket you want to store the Delta Table in.
<table-uri> (required) - The URI of the Amazon S3 bucket you want to store the Delta Table in. This typically takes the format s3://my-bucket/my-folder.
<aws-access-key-id> (required) - The AWS access key ID for the AWS IAM principal (such as an IAM user) that has the appropriate access to the S3 bucket.
<aws-secret-access-key> (required) - The AWS secret access key for the corresponding AWS access key ID.
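One way to keep these five required values together before creating the connector is a small settings object that rejects empty fields. This is only a sketch: the class name DeltaTableS3Settings and the snake_case field names are assumptions that mirror the placeholders above, not the connector's confirmed request schema, and the credential values shown are dummies.

```python
from dataclasses import dataclass, fields


@dataclass(frozen=True)
class DeltaTableS3Settings:
    """Hypothetical container for the required connector values."""

    name: str
    aws_region: str
    table_uri: str
    aws_access_key_id: str
    aws_secret_access_key: str

    def __post_init__(self):
        # Every field is marked (required), so reject empty values early.
        for f in fields(self):
            if not getattr(self, f.name):
                raise ValueError(f"{f.name} is required")


settings = DeltaTableS3Settings(
    name="my-delta-table-connector",
    aws_region="us-east-1",
    table_uri="s3://my-bucket/my-folder",
    aws_access_key_id="EXAMPLE_KEY_ID",       # dummy value
    aws_secret_access_key="EXAMPLE_SECRET",   # dummy value
)
print(settings.name, settings.aws_region, settings.table_uri)
```

In practice, read the two credential values from environment variables or a secrets manager rather than hard-coding them.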