As an alternative to creating the infrastructure manually, you can deploy the infrastructure in silent mode.
- Prerequisite: A silent deployment has the same prerequisites as a manual deployment. See Setup requirements.
- Prerequisite: To prepare for running a silent deployment, you must have completed steps 1 and 2 of Creating the infrastructure. You should also read step 3, and complete it if required.
- Prerequisite: To configure SSL, make sure you have a valid SSL certificate that has been imported into AWS Certificate Manager. See https://aws.amazon.com/certificate-manager/ for more information.
- Prerequisite: Make sure that you have created a Kubernetes cluster and three Docker repositories in the AWS ECR registry. The repositories must be named dqplus-main, dqplus-os and dqplus-extension.
-
Populate a variables file with the following required settings. Use the format setting = "value", for example primaryRegion = "us-east-1":
awsPrimaryRegion = " "
primaryRegion = " "
awsAccessKey = " "
awsSecretKey = " "
awsSecondaryRegion = " "
domainName = " "
vpcId = " "
privateSubnet1Id = " "
privateSubnet2Id = " "
deploymentId = " "
primaryS3Bucket = " "
availabilityZone1 = " "
availabilityZone2 = " "
localPathToEmrKey = " "
localPathToEc2Key = " "
s3KmsKey = " "
#Aurora
auroraPostgresPassword = " "
auroraDbInstanceClass = "db.r5.xlarge"
auroraBackupRetentionPeriod = "30"
auroraPreferredBackupWindow = "04:02-04:32"
auroraPreferredMaintenanceWindow = "sat:02:00-sat:02:30"
auroraSnapshotIdentifier = " "
auroraSecurityGroupId = " "
rdsSubnetGroupName = " "
postgresKmsKeyArn = " "
postgresEngineVersion = ""
#Redshift
redshiftEnabled = "true"
redshiftMasterPassword = " "
redshiftStandardSnapshotIdentifier = " "
redshiftHpSnapshotIdentifier = " "
redshiftClusterSubnetGroupName = " "
redshiftSecurityGroupId = " "
redshiftIamRoleArn = " "
#EMR
emrEc2KeyName = " "
emrCoreInstanceCount = "3"
emrCoreInstanceType = "r5d.xlarge"
emrTaskInstanceCount = "1"
emrTaskInstanceType = "r5d.xlarge"
emrMasterInstanceCount = "3"
emrMasterInstanceType = "r5d.xlarge"
emrKmsKeyArn = " "
emrMasterSecurityGroupId = " "
emrSlaveSecurityGroupId = " "
emrServiceAccessSecurityGroupId = " "
emrInstanceProfileArn = " "
emrIamRoleArn = " "
emrCustomAmiId = " "
#EB
publicSubnet1Id = " "
publicSubnet2Id = " "
elasticBeanstalkPlatformName = "64bit Amazon Linux 2 v4.1.3 running Tomcat 9.0 Corretto 11"
ebMaxInstanceCount = "4"
ebMinInstanceCount = "2"
ebInstanceType = "c5.2xlarge"
ebKeyPairName = " "
ebAppCount = "1"
sslConnectionStatus = "true"
httpsListenerEnable = "true"
httpListenerEnable = "false"
ebOrK8s = "0"
Note: ebOrK8s works as a flag that selects an Elastic Beanstalk or Kubernetes deployment. If ebOrK8s = "0", Elastic Beanstalk is deployed. If ebOrK8s = "1", the Elastic Beanstalk deployment is skipped and the configuration required for a Kubernetes deployment is performed instead.
sslCertificateArn = "arn:aws:acm:us-east-1:0517023248360:certificate/xxxxx-aabd-4134-a85e-1ea86xxxxxx"
Note: sslConnectionStatus works as a flag, while the other three values are read into a Terraform variable file. If sslConnectionStatus = "true", httpsListenerEnable should be "true", httpListenerEnable should be "false", and sslCertificateArn should be set to the ARN of the SSL certificate. Similarly, if sslConnectionStatus = "false", httpsListenerEnable should be "false", httpListenerEnable should be "true", and sslCertificateArn can be blank ("").
ebInstanceProfileArn = " "
ebSecurityGroupIds = " "
ebCustomAmiId = " "
Note: Where the above settings contain a value, for example ebMaxInstanceCount = "4", this is the recommended default value which should be used in most cases. Please note that you must include all settings.
Note: The elasticBeanstalkPlatformName value can change regularly due to the AWS release schedule. It is recommended that you verify the value for this setting from your AWS account by running the following AWS CLI command:
aws elasticbeanstalk list-available-solution-stacks | grep "running Tomcat 9.0 Corretto 11"
Tip: During a manual deployment, the values for these settings are entered via prompts when running the python3 setup.py script.
The following table provides additional information about each of the properties:
Property Description
awsPrimaryRegion
The AWS primary region, for example:
awsPrimaryRegion = "us-east-1"
primaryRegion
The AWS primary region, for example:
primaryRegion = "us-east-1"
awsAccessKey
The AWS access key that will be used for deployment, for example:
awsAccessKey = "AKIAIOSFODNN7EXAMPLE"
This must have Admin access for the duration of the installation. For more information, see the AWS documentation, for example at: https://docs.aws.amazon.com/IAM/latest/UserGuide/id_credentials_access-keys.html
awsSecretKey
The AWS secret key that will be used for deployment, for example:
awsSecretKey = "wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY"
This must have Admin access for the duration of the installation. For more information, see the AWS documentation, for example at: https://docs.aws.amazon.com/IAM/latest/UserGuide/id_credentials_access-keys.html
awsSecondaryRegion
The AWS secondary region, for example:
awsSecondaryRegion = "us-east-2"
domainName
The domain name for the installation, for example:
domainName = "infogix.com"
vpcId
The VPC ID of the Virtual Private Cloud to install into, for example:
vpcId = "vpc-6hj35709"
privateSubnet1Id
The subnet ID for the first private subnet, for example:
privateSubnet1Id = "subnet-07b3d0bf6h8op62ed"
privateSubnet2Id
The subnet ID for the second private subnet, for example:
privateSubnet2Id = "subnet-00ecf411w465df7ef"
deploymentId
The unique deployment ID associated with the AWS IAM user or AWS account, for example:
deploymentId = "dev"
primaryS3Bucket
The name for the S3 bucket to be used for the installation.
availabilityZone1
The name of the first Availability Zone for the Aurora cluster.
availabilityZone2
The name of the second Availability Zone for the Aurora cluster.
localPathToEmrKey
The local path to the .pem file that can be used to log in to the EMR primary server. This must be a local path on the VM from which the installation is initiated.
localPathToEc2Key
The path to the Elastic Beanstalk key pair on the VM, for example:
localPathToEc2Key = "/home/ec2-user/engineering.pem"
s3KmsKey
The ARN of the KMS key used for S3.
auroraPostgresPassword
The password for the Amazon Aurora account, for example:
auroraPostgresPassword = "Password1"
auroraDbInstanceClass
The Amazon Aurora database instance class. The default value is:
auroraDbInstanceClass = "db.r5.xlarge"
For more information, see the AWS documentation, for example at:
https://docs.aws.amazon.com/AmazonRDS/latest/AuroraUserGuide/Concepts.DBInstanceClass.html
auroraBackupRetentionPeriod
The number of days to retain backup data. The default value is 30 days:
auroraBackupRetentionPeriod = "30"
auroraPreferredBackupWindow
The preferred time to perform backups. The default value is between 04:02 AM and 04:32 AM:
auroraPreferredBackupWindow = "04:02-04:32"
auroraPreferredMaintenanceWindow
The preferred time to perform maintenance. The default value is on a Saturday between 02:00 AM and 02:30 AM:
auroraPreferredMaintenanceWindow = "sat:02:00-sat:02:30"
auroraSnapshotIdentifier
Specifies whether or not to create this cluster from a snapshot. You can use either the name or ARN when specifying a DB cluster snapshot, or the ARN when specifying a DB snapshot.
auroraSecurityGroupId
The security group ID to use for the Aurora database cluster.
rdsSubnetGroupName
The subnet group name to be used for the Aurora database cluster.
postgresKmsKeyArn
The ARN of the KMS key to be used for Aurora Postgres encryption.
postgresEngineVersion
The version of the Aurora engine to use. This value is optional. If it is not included in the variables file, the default of 11.9 is used.
redshiftEnabled
Determines whether to install the Redshift environment. If the value is set to "true", the script creates Redshift. If it is set to "false", Redshift creation is skipped.
redshiftMasterPassword
The password for the Amazon Redshift database, for example:
redshiftMasterPassword = "Password1"
redshiftClusterSubnetGroupName
The name of the subnet group to use for the Redshift cluster.
redshiftSecurityGroupId
The ID of the security group to use for Redshift access.
redshiftIamRoleArn
The ARN of the IAM role to use for Redshift access.
redshiftStandardSnapshotIdentifier
The name of the snapshot from which to create the new Standard Redshift cluster if restoring from an existing DB.
redshiftHpSnapshotIdentifier
The name of the snapshot from which to create the new High Performance Redshift cluster if restoring from an existing DB.
ebOrK8s
A flag that selects an Elastic Beanstalk or Kubernetes deployment. If ebOrK8s = "0", Elastic Beanstalk is deployed. If ebOrK8s = "1", the Elastic Beanstalk deployment is skipped and the configuration required for a Kubernetes deployment is performed instead.
publicSubnet1Id
If using public subnets, enter the first subnet ID. Otherwise, leave as " ".
publicSubnet2Id
If using public subnets, enter the second subnet ID. Otherwise, leave as " ".
elasticBeanstalkPlatformName
The Elastic Beanstalk platform name, for example:
elasticBeanstalkPlatformName = "64bit Amazon Linux 2 v4.1.3 running Tomcat 9.0 Corretto 11"
The elasticBeanstalkPlatformName value can change regularly due to the AWS release schedule. It is recommended that you verify the value for this setting from your AWS account by running the following AWS CLI command:
aws elasticbeanstalk list-available-solution-stacks | grep "running Tomcat 9.0 Corretto 11"
ebMaxInstanceCount
The maximum number of EC2 instances in your Elastic Beanstalk environment. The default value is:
ebMaxInstanceCount = "4"
ebMinInstanceCount
The minimum number of EC2 instances in your Elastic Beanstalk environment. The default value is:
ebMinInstanceCount = "2"
ebInstanceType
The Elastic Beanstalk instance type. The default value is:
ebInstanceType = "c5.2xlarge"
ebKeyPairName
The name of the AWS key pair for the Elastic Beanstalk (EB) instance, for example:
ebKeyPairName = "engineering"
This can be the same as the emrEc2KeyName, or it can be different.
ebAppCount
Determines whether to install the Elastic Beanstalk environments into an existing application called dqplus. If you already have a Data360 DQ+ installation in AWS, the value should be 0. If this is your first implementation of Data360 DQ+, the value should be 1. The default value is:
ebAppCount = "1"
ebInstanceProfileArn
The ARN of the instance profile to assign to Elastic Beanstalk instances.
ebSecurityGroupIds
The security group IDs, separated by commas, to apply to Elastic Beanstalk instances.
ebCustomAmiId
The AMI ID to use for Elastic Beanstalk instances if you are using a custom image.
If you are not using a custom image, do not add this setting to the file.
emrEc2KeyName
The name of the AWS key pair for the Elastic Map Reduce (EMR) instance, for example:
emrEc2KeyName = "engineering"
This can be the same as the ebKeyPairName, or it can be different.
emrCoreInstanceCount
The number of EMR core instances. The default value is:
emrCoreInstanceCount = "3"
emrCoreInstanceType
The EMR core node type. Infogix supports the r5d instance range for this setting.
The default value is:
emrCoreInstanceType = "r5d.xlarge"
emrTaskInstanceCount
The number of task nodes in your EMR cluster. The default value is:
emrTaskInstanceCount = "1"
emrTaskInstanceType
The EMR task node type. Infogix supports the r5d instance range for this setting.
The default value is:
emrTaskInstanceType = "r5d.xlarge"
emrMasterInstanceCount
The number of EMR master instances. The default value is:
emrMasterInstanceCount = "3"
emrMasterInstanceType
The EMR primary node type. Infogix supports the r5d instance range for this setting.
The default value is:
emrMasterInstanceType = "r5d.xlarge"
emrKmsKeyArn
The ARN of the KMS key to use for EMR.
emrMasterSecurityGroupId
The ID of the security group to use for the EMR primary instance.
emrSlaveSecurityGroupId
The ID of the security group to use for the EMR secondary instances.
emrServiceAccessSecurityGroupId
The ID of the Amazon EC2 service-access security group. This is required when the cluster runs on a private subnet.
emrInstanceProfileArn
The ARN of the instance profile to assign to EMR instances.
emrIamRoleArn
The ARN of the IAM role to use for EMR access.
emrCustomAmiId
The AMI ID to use for EMR instances if you are using a custom image.
If you are not using a custom image, do not add this setting to the file.
sslConnectionStatus
A flag that determines whether SSL is configured. The default value is:
sslConnectionStatus = "false"
(for HTTP).
httpsListenerEnable
Elastic Beanstalk uses the httpsListenerEnable value to configure the Application Load Balancer listeners. The default value is:
httpsListenerEnable = "false"
(because sslConnectionStatus is "false" and this setting is for an HTTPS configuration).
httpListenerEnable
Elastic Beanstalk uses the httpListenerEnable value to configure the Application Load Balancer listeners. The default value is:
httpListenerEnable = "true"
(because sslConnectionStatus is "false" and this setting is for an HTTP configuration).
sslCertificateArn
The SSL certificate that Elastic Beanstalk uses. The default value is:
sslCertificateArn = ""
(because sslConnectionStatus is "false" and this is for an HTTP configuration).
-
Name the variables file dqplus.tfvars or vars.auto.tfvars and save it in the infra/aws/config directory.
-
Create a new properties file called dqplus.properties and add the following property:
OVERRIDES_FOLDER=<path to OVERRIDES_FOLDER>
For example:
OVERRIDES_FOLDER=/tmp/files/
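Before running Terraform, it can help to check the variables file populated earlier against the SSL rule described in the notes. The following is a minimal Python sketch, not part of the product. The function names parse_settings and check_ssl_settings are hypothetical, and the parser assumes that every setting sits on one line in the setting = "value" format, so it is not a full Terraform variable file parser.

```python
import re

# Matches lines of the form: setting = "value"
_SETTING = re.compile(r'^\s*([A-Za-z0-9_]+)\s*=\s*"(.*)"\s*$')


def parse_settings(text):
    """Parse setting = "value" lines into a dict, skipping blanks and # comments."""
    settings = {}
    for line in text.splitlines():
        stripped = line.strip()
        if not stripped or stripped.startswith("#"):
            continue
        match = _SETTING.match(stripped)
        if match is None:
            raise ValueError(f"Unrecognized line: {line!r}")
        key, value = match.groups()
        if key in settings:
            raise ValueError(f"Duplicate setting: {key}")
        # A placeholder such as " " is treated as an empty value.
        settings[key] = value.strip()
    return settings


def check_ssl_settings(settings):
    """Return a list of problems with the SSL-related settings (empty if consistent)."""
    problems = []
    ssl_on = settings.get("sslConnectionStatus") == "true"
    expected = {
        "httpsListenerEnable": "true" if ssl_on else "false",
        "httpListenerEnable": "false" if ssl_on else "true",
    }
    for key, want in expected.items():
        if settings.get(key) != want:
            problems.append(f'{key} should be "{want}"')
    if ssl_on and not settings.get("sslCertificateArn"):
        problems.append("sslCertificateArn must be set when SSL is enabled")
    return problems


# Illustrative input only: SSL is on, but the HTTP listener is still enabled
# and no certificate ARN is given.
example = '''
sslConnectionStatus = "true"
httpsListenerEnable = "true"
httpListenerEnable = "true"
sslCertificateArn = " "
'''
for problem in check_ssl_settings(parse_settings(example)):
    print(problem)
```

To check a real file, pass its contents to parse_settings, for example parse_settings(open("dqplus.tfvars").read()).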
-
From the infra/aws/config directory, run the following commands, using the name of your variables file for --var-file (dqplus.tfvars or vars.auto.tfvars):
terraform plan --target=module.buckets --var-file=dqplus.tfvars --out=bucketplan
terraform apply bucketplan
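If you script the deployment, the plan and apply pair can be generated rather than typed. The following Python sketch is illustrative only: plan_and_apply and run_all are hypothetical helper names, and the variables file name and working directory are taken from this procedure. It prints the commands for review. Nothing is executed unless you call run_all yourself.

```python
import shlex
import subprocess


def plan_and_apply(var_file, plan_file, target=None):
    """Build the terraform plan and apply commands as argument lists."""
    plan = ["terraform", "plan"]
    if target is not None:
        plan.append(f"--target={target}")
    plan += [f"--var-file={var_file}", f"--out={plan_file}"]
    return [plan, ["terraform", "apply", plan_file]]


def run_all(commands, cwd="infra/aws/config"):
    """Run each command in order; check=True raises on the first failure."""
    for command in commands:
        subprocess.run(command, cwd=cwd, check=True)


# Print the bucket commands for review without executing anything.
for command in plan_and_apply("dqplus.tfvars", "bucketplan", target="module.buckets"):
    print(shlex.join(command))
```

Calling plan_and_apply("dqplus.tfvars", "plan") without a target produces the commands for the full infrastructure plan that is run later in this procedure.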
-
Upload the emrBootstrap.sh and emrBootstrap2.sh files to the newly created bucket in the following locations:
/sparkshared/bootstrap/emrBootstrap.sh
/sparkshared/bootstrap/emrBootstrap2.sh
Tip: You can find the emrBootstrap.sh file in the config directory.
Note: If you are using a custom EMR Bootstrap script, upload that one, not the file from the config directory. See step 3 of Creating the infrastructure for more details.
- Upload any self-signed certificates that you need to install to the newly created bucket in the following location:
/sparkshared/certs/
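The two upload steps above can be sketched as AWS CLI copy commands. This Python example is a hedged illustration, not the documented method: upload_commands is a hypothetical helper, and the bucket name, the local location of emrBootstrap2.sh and the certificate file name are placeholders. It only prints aws s3 cp commands so that you can review the destination keys before uploading.

```python
import shlex
from pathlib import PurePosixPath

# Destination prefixes taken from the locations listed in this procedure.
BOOTSTRAP_PREFIX = "sparkshared/bootstrap"
CERTS_PREFIX = "sparkshared/certs"


def upload_commands(bucket, bootstrap_files, cert_files=()):
    """Build `aws s3 cp` argument lists for the bootstrap scripts and certificates."""
    commands = []
    groups = ((BOOTSTRAP_PREFIX, bootstrap_files), (CERTS_PREFIX, cert_files))
    for prefix, files in groups:
        for local_path in files:
            # Keep only the file name; the prefix decides the destination folder.
            name = PurePosixPath(local_path).name
            destination = f"s3://{bucket}/{prefix}/{name}"
            commands.append(["aws", "s3", "cp", local_path, destination])
    return commands


for command in upload_commands(
    "example-bucket",  # placeholder: use the bucket created by Terraform
    ["config/emrBootstrap.sh", "config/emrBootstrap2.sh"],  # assumed local paths
    ["certs/internal-ca.pem"],  # placeholder certificate file
):
    print(shlex.join(command))
```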
- From the infra/aws/config directory, run the following commands, using the name of your variables file for --var-file (dqplus.tfvars or vars.auto.tfvars):
terraform plan --var-file=dqplus.tfvars --out=plan
terraform apply plan
Note: Running terraform apply plan will create the infrastructure and will incur costs. Before executing this command, it is recommended that you first verify what will be created by running terraform show plan.
- From the infra/aws/config directory, run the following commands:
For EB deployment:
python3 ./properties.py
python3 ./password.py
For K8s deployment:
python3 ./properties.py
python3 ./password.py
python3 ./properties-ecr.py
python3 ./password-k8s.py
The properties.py script parses data from the output of Terraform and creates a new properties file required by the installer to install the product.
The password.py script parses data from the output of Terraform and creates a new file containing sensitive information required by the installer.
The properties-ecr.py script parses data from the output of Terraform and creates a new properties file required by the installer for the Docker container registry (AWS ECR).
The properties-k8s.py script parses data and edits the properties required by the installer for a Kubernetes deployment.
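The choice between the two command lists can be tied to the ebOrK8s setting. The sketch below is an assumption-laden illustration: post_terraform_commands is a hypothetical helper, the mapping from ebOrK8s to a command list is inferred from the flag description, and the script names are copied from the command lists above. It prints the commands rather than running them.

```python
import shlex

# Script names as they appear in the command lists for each deployment type.
EB_SCRIPTS = ["properties.py", "password.py"]
K8S_SCRIPTS = EB_SCRIPTS + ["properties-ecr.py", "password-k8s.py"]


def post_terraform_commands(eb_or_k8s):
    """Return the python3 commands for an ebOrK8s value ("0" = EB, "1" = K8s)."""
    if eb_or_k8s == "0":
        scripts = EB_SCRIPTS
    elif eb_or_k8s == "1":
        scripts = K8S_SCRIPTS
    else:
        raise ValueError(f'ebOrK8s must be "0" or "1", got {eb_or_k8s!r}')
    return [["python3", f"./{name}"] for name in scripts]


# Show the commands for a Kubernetes deployment, in the order listed above.
for command in post_terraform_commands("1"):
    print(shlex.join(command))
```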