Amazon AWS Certified Big Data - Specialty Exam Practice Questions (P. 5)
Question #21
A data engineer is about to perform a major upgrade to the DDL contained within an Amazon Redshift cluster to support a new data warehouse application. The upgrade scripts will include user permission updates, view and table structure changes, and additional loading and data manipulation tasks.
The data engineer must be able to restore the database to its existing state in the event of issues.
Which action should be taken prior to performing this upgrade task?
- A. Run an UNLOAD command for all data in the warehouse and save it to S3.
- B. Create a manual snapshot of the Amazon Redshift cluster.
- C. Make a copy of the automated snapshot on the Amazon Redshift cluster.
- D. Call the waitForSnapshotAvailable command from either the AWS CLI or an AWS SDK.
Correct Answer:
B
Reference: https://docs.aws.amazon.com/redshift/latest/mgmt/working-with-snapshots.html#working-with-snapshot-restore-table-from-snapshot
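Creating the manual snapshot (answer B) is a single API call before the upgrade runs. A minimal sketch of building the request parameters for boto3's Redshift `create_cluster_snapshot`; the cluster identifier and naming scheme below are hypothetical:

```python
from datetime import datetime, timezone

def manual_snapshot_request(cluster_id: str) -> dict:
    """Build parameters for Redshift CreateClusterSnapshot
    (boto3: client("redshift").create_cluster_snapshot(**params))."""
    # Timestamp the snapshot name so repeated upgrade attempts don't collide.
    stamp = datetime.now(timezone.utc).strftime("%Y%m%d-%H%M%S")
    return {
        "SnapshotIdentifier": f"{cluster_id}-pre-ddl-upgrade-{stamp}",
        "ClusterIdentifier": cluster_id,
    }
```

Unlike automated snapshots, a manual snapshot is retained until explicitly deleted, so it survives long enough to roll back a failed DDL upgrade.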
Question #22
A large oil and gas company needs to provide near real-time alerts when peak thresholds are exceeded in its pipeline system. The company has developed a system to capture pipeline metrics such as flow rate, pressure, and temperature using millions of sensors. The sensors deliver their data to AWS IoT.
What is a cost-effective way to provide near real-time alerts on the pipeline metrics?
- A. Create an AWS IoT rule to generate an Amazon SNS notification.
- B. Store the data points in an Amazon DynamoDB table and poll it for peak metrics data from an Amazon EC2 application.
- C. Create an Amazon Machine Learning model and invoke it with AWS Lambda.
- D. Use Amazon Kinesis Streams and a KCL-based application deployed on AWS Elastic Beanstalk.
Correct Answer:
C
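For reference, the serverless thresholding mechanism in option A is an AWS IoT topic rule whose SQL filters on a peak value and publishes to SNS. A sketch of the `topicRulePayload` argument to boto3's `iot` `create_topic_rule`; the topic, threshold, and ARNs are hypothetical:

```python
def alert_rule_payload(pressure_limit: float) -> dict:
    # Payload shape for AWS IoT create_topic_rule (boto3 "iot" client).
    # Messages matching the SQL WHERE clause trigger an SNS publish,
    # with no servers or polling involved.
    return {
        "sql": (
            "SELECT * FROM 'pipeline/metrics' "
            f"WHERE pressure > {pressure_limit}"
        ),
        "actions": [
            {
                "sns": {
                    # Hypothetical ARNs for illustration.
                    "targetArn": "arn:aws:sns:us-east-1:123456789012:pipeline-alerts",
                    "roleArn": "arn:aws:iam::123456789012:role/iot-sns-publish",
                    "messageFormat": "JSON",
                }
            }
        ],
        "ruleDisabled": False,
    }
```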
Question #23
A company is using Amazon Machine Learning as part of a medical software application. The application will predict the most likely blood type for a patient based on a variety of other clinical tests that are available when blood type knowledge is unavailable.
What is the appropriate model choice and target attribute combination for this problem?
- A. Multi-class classification model with a categorical target attribute.
- B. Regression model with a numeric target attribute.
- C. Binary classification with a categorical target attribute.
- D. K-Nearest Neighbors model with a multi-class target attribute.
Correct Answer:
A
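The choice follows from the target attribute: Amazon ML offers regression for a numeric target, binary classification for a two-class categorical target, and multi-class classification for a categorical target with more than two classes. A sketch of that decision rule, with blood type (four classes: A, B, AB, O) as the example:

```python
def ml_model_type(target_is_numeric: bool, n_classes: int = 0) -> str:
    # Maps a target attribute to the Amazon ML model type
    # (the MLModelType values REGRESSION / BINARY / MULTICLASS).
    if target_is_numeric:
        return "REGRESSION"   # numeric target
    if n_classes == 2:
        return "BINARY"       # two-class categorical target
    return "MULTICLASS"       # categorical target with >2 classes

# Blood type: categorical with four classes -> MULTICLASS.
```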
Question #24
A data engineer runs a data warehouse (DWH) on a 25-node Amazon Redshift cluster for a SaaS service and needs to build a dashboard that customers will use. Five large customers represent 80% of usage, with a long tail of dozens of smaller customers. The dashboarding tool has already been selected.
How should the data engineer make sure that the larger customer workloads do NOT interfere with the smaller customer workloads?
- A. Apply query filters based on customer-id that can NOT be changed by the user and apply distribution keys on customer-id.
- B. Place the largest customers into a single user group with a dedicated query queue and place the rest of the customers into a different query queue.
- C. Push aggregations into an RDS for Aurora instance. Connect the dashboard application to Aurora rather than Redshift for faster queries.
- D. Route the largest customers to a dedicated Redshift cluster. Raise the concurrency of the multi-tenant Redshift cluster to accommodate the remaining customers.
Correct Answer:
D
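For reference, isolating workloads inside a single cluster (option B) is configured through Redshift's `wlm_json_configuration` parameter, which assigns user groups to dedicated query queues. A sketch of building that parameter value; the group name and concurrency slots are hypothetical:

```python
import json

def wlm_config() -> str:
    # Value for the Redshift "wlm_json_configuration" cluster parameter:
    # a dedicated queue for the big-customer user group, plus the
    # default queue (no user_group/query_group) for everyone else.
    queues = [
        {"user_group": ["big_customers"], "query_concurrency": 5},
        {"query_concurrency": 5},  # default queue must come last
    ]
    return json.dumps(queues)
```

This keeps a burst of long queries from the five large customers from consuming all query slots that the smaller customers' dashboards rely on.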
Question #25
An Amazon Kinesis stream needs to be encrypted.
Which approach should be used to accomplish this task?
- A. Perform a client-side encryption of the data before it enters the Amazon Kinesis stream on the producer.
- B. Use a partition key to segment the data by MD5 hash function, which makes it undecipherable while in transit.
- C. Perform a client-side encryption of the data before it enters the Amazon Kinesis stream on the consumer.
- D. Use a shard to segment the data, which has built-in functionality to make it indecipherable while in transit.
Correct Answer:
A
Reference: https://docs.aws.amazon.com/firehose/latest/dev/encryption.html
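The producer-side pattern in answer A is: encrypt the payload first, then pass the ciphertext as the `Data` blob of a Kinesis `put_record`. A toy sketch of where the encryption step sits; the XOR cipher is NOT secure and stands in for a real cipher (e.g. AES-GCM with a KMS-generated data key), and the stream name and partition key are hypothetical:

```python
def xor_cipher(data: bytes, key: bytes) -> bytes:
    # Toy stand-in for a real cipher -- XOR is NOT secure; it only
    # illustrates that encryption happens before the put_record call.
    return bytes(b ^ key[i % len(key)] for i, b in enumerate(data))

def encrypt_record(plaintext: bytes, key: bytes) -> dict:
    # Producer-side: the record is already ciphertext when it enters
    # the stream; consumers holding the key decrypt after reading.
    return {
        "StreamName": "pipeline-metrics",       # hypothetical
        "Data": xor_cipher(plaintext, key),     # ciphertext blob
        "PartitionKey": "sensor-1",             # hypothetical
    }
```

The resulting dict has the shape expected by boto3's Kinesis `put_record`; the data stays opaque both in transit and at rest in the stream.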