Amazon AWS Certified Big Data - Specialty Exam Practice Questions (P. 3)
Question #11
An administrator needs to manage a large catalog of items from various external sellers. The administrator needs to determine if the items should be identified as minimally dangerous, dangerous, or highly dangerous based on their textual descriptions. The administrator already has some items with the danger attribute, but receives hundreds of new item descriptions every day without such classification.
The administrator has a system that captures dangerous goods reports from the customer support team or from user feedback.
What is a cost-effective architecture to solve this issue?
- A. Build a set of regular expression rules that are based on the existing examples, and run them on the DynamoDB Streams as every new item description is added to the system.
- B. Build a Kinesis Streams process that captures and marks the relevant items in the dangerous goods reports using a Lambda function once more than two reports have been filed.
- C. Build a machine learning model to properly classify dangerous goods and run it on the DynamoDB Streams as every new item description is added to the system.
- D. Build a machine learning model with binary classification for dangerous goods and run it on the DynamoDB Streams as every new item description is added to the system.
Correct Answer:
C
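Option C fits because the catalog needs a three-way (multiclass) classification, which rules out the binary model in option D, and hand-written regular expressions (option A) generalize poorly to unseen descriptions. Below is a minimal sketch of the stream-driven classifier, assuming a Lambda function subscribed to the catalog table's DynamoDB Stream and a hosted text-classification endpoint; the endpoint name, attribute names, and response payload shape are all hypothetical.

```python
# Sketch of option C: classify each newly inserted item description via a
# hosted multiclass model. Endpoint name, attribute names, and the response
# payload shape are assumptions, not part of the question.
import json
import boto3

runtime = boto3.client("sagemaker-runtime")

ENDPOINT = "danger-classifier"  # hypothetical model endpoint
LABELS = ["minimally dangerous", "dangerous", "highly dangerous"]

def handler(event, context):
    for record in event["Records"]:
        if record["eventName"] != "INSERT":
            continue  # classify only newly added item descriptions
        new_image = record["dynamodb"]["NewImage"]
        description = new_image["description"]["S"]  # assumed attribute name

        # Ask the model to pick one of the three danger levels.
        response = runtime.invoke_endpoint(
            EndpointName=ENDPOINT,
            ContentType="application/json",
            Body=json.dumps({"text": description}),
        )
        result = json.loads(response["Body"].read())  # payload shape assumed
        print(new_image["itemId"]["S"], "->", LABELS[result["predicted_label"]])
```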
Question #12
A company receives data sets from external providers on Amazon S3. Data sets from different providers are dependent on one another. Data sets will arrive at different times and in no particular order.
A data architect needs to design a solution that enables the company to do the following:
✑ Rapidly perform cross data set analysis as soon as the data becomes available
✑ Manage dependencies between data sets that arrive at different times
Which architecture strategy offers a scalable and cost-effective solution that meets these requirements?
- A. Maintain data dependency information in Amazon RDS for MySQL. Use an AWS Data Pipeline job to load an Amazon EMR Hive table based on task dependencies and event notification triggers in Amazon S3.
- B. Maintain data dependency information in an Amazon DynamoDB table. Use Amazon SNS and event notifications to publish data to a fleet of Amazon EC2 workers. Once the task dependencies have been resolved, process the data with Amazon EMR.
- C. Maintain data dependency information in an Amazon ElastiCache Redis cluster. Use Amazon S3 event notifications to trigger an AWS Lambda function that maps the S3 object to Redis. Once the task dependencies have been resolved, process the data with Amazon EMR.
- D. Maintain data dependency information in an Amazon DynamoDB table. Use Amazon S3 event notifications to trigger an AWS Lambda function that maps the S3 object to the task associated with it in DynamoDB. Once all task dependencies have been resolved, process the data with Amazon EMR.
Correct Answer:
C
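A minimal sketch of the event-driven dependency tracker behind the marked answer: an S3 event notification invokes a Lambda function that records the arrival in ElastiCache Redis and, once a job's pending set is empty, submits an EMR step. The Redis endpoint, key-naming scheme, cluster ID, and job script are all assumptions.

```python
# Sketch of option C: S3 event -> Lambda -> Redis dependency bookkeeping,
# then an EMR step once every dependency for a job has arrived.
import json
import urllib.parse

import boto3
import redis  # redis-py; must be bundled into the Lambda deployment package

# Hypothetical ElastiCache Redis endpoint reachable from the Lambda's VPC.
r = redis.Redis(host="my-cache.abc123.0001.use1.cache.amazonaws.com")
emr = boto3.client("emr")

def handler(event, context):
    for record in event["Records"]:
        key = urllib.parse.unquote_plus(record["s3"]["object"]["key"])
        # Assumed convention: the key prefix names the provider's data set,
        # and each analysis job waits on a Redis set of outstanding data sets.
        dataset = key.split("/")[0]
        for job in r.smembers(f"waiting-on:{dataset}"):
            job_id = job.decode()
            removed = r.srem(f"pending:{job_id}", dataset)
            if removed and r.scard(f"pending:{job_id}") == 0:
                # All dependencies resolved: kick off the EMR processing step.
                emr.add_job_flow_steps(
                    JobFlowId="j-XXXXXXXX",  # hypothetical running cluster
                    Steps=[{
                        "Name": f"process-{job_id}",
                        "ActionOnFailure": "CONTINUE",
                        "HadoopJarStep": {
                            "Jar": "command-runner.jar",
                            "Args": ["spark-submit", "s3://bucket/job.py"],
                        },
                    }],
                )
```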
Question #13
A media advertising company handles a large number of real-time messages sourced from over 200 websites. Processing latency must be kept low. Based on calculations, a 60-shard Amazon Kinesis stream is more than sufficient to handle the maximum data throughput, even with traffic spikes. The company also uses an Amazon Kinesis Client Library (KCL) application running on Amazon Elastic Compute Cloud (EC2) instances managed by an Auto Scaling group. Amazon CloudWatch indicates an average of 25% CPU and a modest level of network traffic across all running servers.
The company reports a 150% to 200% increase in latency of processing messages from Amazon Kinesis during peak times. There are NO reports of delay from the sites publishing to Amazon Kinesis.
What is the appropriate solution to address the latency?
- A. Increase the number of shards in the Amazon Kinesis stream to 80 for greater concurrency.
- B. Increase the size of the Amazon EC2 instances to increase network throughput.
- C. Increase the minimum number of instances in the Auto Scaling group.
- D. Increase Amazon DynamoDB throughput on the checkpoint table.
Correct Answer:
D
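The telling symptoms are low CPU and modest network traffic: the workers are not the bottleneck, so options B and C add capacity where none is needed, and the stream is already over-provisioned at 60 shards. The KCL checkpoints shard positions in a DynamoDB lease table, and when that table is throttled, processing latency rises even while the fleet sits idle. A minimal sketch of option D follows, assuming the lease table keeps the KCL's default name (the application name) and uses provisioned capacity; the capacity numbers are illustrative, not a sizing recommendation.

```python
# Sketch of option D: raise provisioned throughput on the KCL's DynamoDB
# lease/checkpoint table. Table name and capacity values are assumptions.
import boto3

dynamodb = boto3.client("dynamodb")

dynamodb.update_table(
    TableName="my-kcl-application",  # hypothetical lease table name
    ProvisionedThroughput={
        "ReadCapacityUnits": 100,   # illustrative values only
        "WriteCapacityUnits": 100,
    },
)
```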
Question #14
A Redshift data warehouse has different user teams that need to query the same table with very different query types. These user teams are experiencing poor performance.
Which action improves performance for the user teams in this situation?
- A. Create custom table views.
- B. Add interleaved sort keys per team.
- C. Maintain team-specific copies of the table.
- D. Add support for workload management queue hopping.
Correct Answer:
D
Reference: https://docs.aws.amazon.com/redshift/latest/dg/cm-c-implementing-workload-management.html
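Queue hopping lets workload management (WLM) move a query that trips a query monitoring rule into the next matching queue, so one team's long-running queries stop blocking another team's short ones. A minimal sketch of such a configuration, applied through the wlm_json_configuration parameter of a cluster parameter group; the parameter group name, queue layout, and 30-second threshold are all assumptions.

```python
# Sketch of option D: a WLM configuration whose query monitoring rule hops
# long-running queries out of the interactive queue. Names and thresholds
# are assumptions.
import json
import boto3

redshift = boto3.client("redshift")

wlm = [
    {
        "query_group": ["interactive"],
        "query_concurrency": 5,
        "rules": [{
            "rule_name": "hop_long_queries",
            "predicate": [{"metric_name": "query_execution_time",
                           "operator": ">", "value": 30}],
            "action": "hop",  # move the query to the next matching queue
        }],
    },
    {"query_group": ["batch"], "query_concurrency": 2},
    {"query_concurrency": 1},  # default queue (last entry)
]

redshift.modify_cluster_parameter_group(
    ParameterGroupName="my-wlm-params",  # hypothetical parameter group
    Parameters=[{
        "ParameterName": "wlm_json_configuration",
        "ParameterValue": json.dumps(wlm),
    }],
)
```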
Question #15
A company operates an international business served from a single AWS region. The company wants to expand into a new country. The regulator for that country requires the Data Architect to maintain a log of financial transactions in the country within 24 hours of the product transaction. The production application is latency insensitive. There is an AWS region in the new country.
What is the most cost-effective way to meet this requirement?
- A. Use CloudFormation to replicate the production application to the new region.
- B. Use Amazon CloudFront to serve application content locally in the country; Amazon CloudFront logs will satisfy the requirement.
- C. Continue to serve customers from the existing region while using Amazon Kinesis to stream transaction data to the regulator.
- D. Use Amazon S3 cross-region replication to copy and persist production transaction logs to a bucket in the new country's region.
Correct Answer:
B
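The mechanism behind the marked answer is CloudFront standard (access) logging, which delivers request logs to an S3 bucket; that bucket can be created in the new country's region, comfortably inside the 24-hour window. A minimal sketch of enabling logging on an existing distribution; the distribution ID and bucket name are hypothetical.

```python
# Sketch of option B's mechanism: turn on CloudFront standard logging to an
# S3 bucket in the new country's region. Distribution ID and bucket name
# are assumptions.
import boto3

cloudfront = boto3.client("cloudfront")

DIST_ID = "E1EXAMPLE"  # hypothetical distribution

# CloudFront updates require the full distribution config plus the current
# ETag, so fetch it, flip on logging, and write it back.
resp = cloudfront.get_distribution_config(Id=DIST_ID)
config = resp["DistributionConfig"]
config["Logging"] = {
    "Enabled": True,
    "IncludeCookies": False,
    "Bucket": "txn-logs.s3.amazonaws.com",  # bucket in the new region (assumed)
    "Prefix": "cloudfront/",
}

cloudfront.update_distribution(
    Id=DIST_ID,
    IfMatch=resp["ETag"],
    DistributionConfig=config,
)
```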