Exam Details

  • Exam Code: DAS-C01
  • Exam Name: AWS Certified Data Analytics - Specialty (DAS-C01)
  • Certification: Amazon Certifications
  • Vendor: Amazon
  • Total Questions: 285 Q&As
  • Last Updated: Apr 27, 2025

Amazon Certifications DAS-C01 Questions & Answers

  • Question 11:

    A hospital uses an electronic health records (EHR) system to collect two types of data:

    1. Patient information, which includes a patient's name and address.

    2. Diagnostic tests conducted and the results of these tests.

    Patient information is expected to change periodically. Existing diagnostic test data never changes, and only new records are added. The hospital runs an Amazon Redshift cluster with four dc2.large nodes and wants to automate the ingestion of the patient information and diagnostic test data into respective Amazon Redshift tables for analysis. The EHR system exports data as CSV files to an Amazon S3 bucket on a daily basis. Two sets of CSV files are generated: one set of files is for patient information with updates, deletes, and inserts; the other set of files is for new diagnostic test data only.

    What is the MOST cost-effective solution to meet these requirements?

    A. Use Amazon EMR with Apache Hudi. Run daily ETL jobs using Apache Spark and the Amazon Redshift JDBC driver.

    B. Use an AWS Glue crawler to catalog the data in Amazon S3. Use Amazon Redshift Spectrum to perform scheduled queries of the data in Amazon S3 and ingest the data into the patient information table and the diagnostic tests table.

    C. Use an AWS Lambda function to run a COPY command that appends new diagnostic test data to the diagnostic tests table. Run another COPY command to load the patient information data into the staging tables. Use a stored procedure to handle create, update, and delete operations for the patient information table.

    D. Use AWS Database Migration Service (AWS DMS) to collect and process change data capture (CDC) records. Use the COPY command to load patient information data into the staging tables. Use a stored procedure to handle create, update, and delete operations for the patient information table.
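
    For context, here is a rough sketch of the load pattern described in option C, written with the boto3 Redshift Data API. The cluster, table, bucket, and stored procedure names (patient_info_staging, sp_merge_patient_info, and so on) are hypothetical placeholders, not part of the question.

    import boto3

    # Hypothetical identifiers -- replace with the real cluster, database, and bucket.
    CLUSTER = "ehr-analytics"
    DATABASE = "ehr"
    DB_USER = "etl_user"
    IAM_ROLE = "arn:aws:iam::123456789012:role/RedshiftCopyRole"

    client = boto3.client("redshift-data")

    def run_sql(sql):
        # Submit a statement through the Redshift Data API (asynchronous).
        return client.execute_statement(
            ClusterIdentifier=CLUSTER, Database=DATABASE, DbUser=DB_USER, Sql=sql
        )

    # Append-only data: new diagnostic test records are copied straight into the target table.
    run_sql(f"""
        COPY diagnostic_tests
        FROM 's3://ehr-exports/diagnostic/'
        IAM_ROLE '{IAM_ROLE}'
        FORMAT AS CSV IGNOREHEADER 1;
    """)

    # Mutable data: load patient information into a staging table, then call a
    # stored procedure that applies the inserts, updates, and deletes to the target.
    run_sql(f"""
        COPY patient_info_staging
        FROM 's3://ehr-exports/patients/'
        IAM_ROLE '{IAM_ROLE}'
        FORMAT AS CSV IGNOREHEADER 1;
    """)
    run_sql("CALL sp_merge_patient_info();")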

  • Question 12:

    A bank is using Amazon Managed Streaming for Apache Kafka (Amazon MSK) to populate real-time data into a data lake. The data lake is built on Amazon S3, and data must be accessible from the data lake within 24 hours. Different microservices produce messages to different topics in the cluster. The cluster is created with 8 TB of Amazon Elastic Block Store (Amazon EBS) storage and a retention period of 7 days.

    The customer transaction volume has tripled recently, and disk monitoring has provided an alert that the cluster is almost out of storage capacity.

    What should a data analytics specialist do to prevent the cluster from running out of disk space?

    A. Use the Amazon MSK console to triple the broker storage and restart the cluster.

    B. Create an Amazon CloudWatch alarm that monitors the KafkaDataLogsDiskUsed metric. Automatically flush the oldest messages when the value of this metric exceeds 85%.

    C. Create a custom Amazon MSK configuration. Set the log.retention.hours parameter to 48. Update the cluster with the new configuration file.

    D. Triple the number of consumers to ensure that data is consumed as soon as it is added to a topic.
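
    A rough sketch of option C with the boto3 MSK ("kafka") client; the cluster ARN and configuration name below are placeholders.

    import boto3

    msk = boto3.client("kafka")

    # Placeholder cluster ARN -- substitute the real one.
    cluster_arn = "arn:aws:kafka:us-east-1:123456789012:cluster/bank-msk/1a2b3c4d-0000-0000-0000-000000000000-1"

    # Create a custom configuration that shortens retention from 7 days to 48 hours.
    config = msk.create_configuration(
        Name="retention-48h",
        ServerProperties=b"log.retention.hours=48",
    )

    # Applying the configuration requires the cluster's current version string.
    current_version = msk.describe_cluster(ClusterArn=cluster_arn)["ClusterInfo"]["CurrentVersion"]

    msk.update_cluster_configuration(
        ClusterArn=cluster_arn,
        ConfigurationInfo={
            "Arn": config["Arn"],
            "Revision": config["LatestRevision"]["Revision"],
        },
        CurrentVersion=current_version,
    )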

  • Question 13:

    A utility company wants to visualize data for energy usage on a daily basis in Amazon QuickSight. A data analytics specialist at the company has built a data pipeline to collect and ingest the data into Amazon S3. Each day, the data is stored in an individual .csv file in an S3 bucket. This is an example of the naming structure: 20210707_data.csv, 20210708_data.csv.

    To allow for data querying in QuickSight through Amazon Athena, the specialist used an AWS Glue crawler to create a table with the path "s3://powertransformer/20210707_data.csv". However, when the data is queried, it returns zero rows.

    How can this issue be resolved?

    A. Modify the IAM policy for the AWS Glue crawler to access Amazon S3.

    B. Ingest the files again.

    C. Store the files in Apache Parquet format.

    D. Update the table path to "s3://powertransformer/".
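
    To illustrate option D, a minimal boto3 sketch of a crawler whose target is the folder prefix rather than a single day's file; the crawler, database, and role names are made up.

    import boto3

    glue = boto3.client("glue")

    # Hypothetical crawler, role, and database names.
    glue.create_crawler(
        Name="powertransformer-daily",
        Role="arn:aws:iam::123456789012:role/GlueCrawlerRole",
        DatabaseName="energy_usage",
        # Point the table at the folder prefix, not a single CSV file, so Athena
        # scans every daily file under the table's location.
        Targets={"S3Targets": [{"Path": "s3://powertransformer/"}]},
    )
    glue.start_crawler(Name="powertransformer-daily")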

  • Question 14:

    A company is reading data from various customer databases that run on Amazon RDS. The databases contain many inconsistent fields. For example, a customer record field that is place_id in one database is location_id in another database. The company wants to link customer records across different databases, even when many customer record fields do not match exactly.

    Which solution will meet these requirements with the LEAST operational overhead?

    A. Create an Amazon EMR cluster to process and analyze data in the databases. Connect to the Apache Zeppelin notebook, and use the FindMatches transform to find duplicate records in the data.

    B. Create an AWS Glue crawler to crawl the databases. Use the FindMatches transform to find duplicate records in the data. Evaluate and tune the transform by evaluating performance and results of finding matches.

    C. Create an AWS Glue crawler to crawl the data in the databases. Use Amazon SageMaker to construct Apache Spark ML pipelines to find duplicate records in the data.

    D. Create an Amazon EMR cluster to process and analyze data in the databases. Connect to the Apache Zeppelin notebook, and use Apache Spark ML to find duplicate records in the data. Evaluate and tune the model by evaluating performance and results of finding duplicates.
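
    For reference, a sketch of applying a trained Glue FindMatches ML transform inside a Glue PySpark job (the approach in option B); the transform ID, database, and table names are hypothetical, and the transform must already be created and trained in AWS Glue.

    from awsglue.context import GlueContext
    from awsglueml.transforms import FindMatches
    from pyspark.context import SparkContext

    glue_context = GlueContext(SparkContext.getOrCreate())

    # Hypothetical catalog database and table produced by the Glue crawler.
    customers = glue_context.create_dynamic_frame.from_catalog(
        database="customer_db", table_name="customers_combined"
    )

    # FindMatches adds a match_id grouping records it believes refer to the same
    # customer, even when fields such as place_id/location_id do not match exactly.
    matched = FindMatches.apply(frame=customers, transformId="tfm-0123456789abcdef")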

  • Question 15:

    A bank wants to migrate a Teradata data warehouse to the AWS Cloud. The bank needs a solution for reading large amounts of data and requires the highest possible performance. The solution also must maintain the separation of storage and compute.

    Which solution meets these requirements?

    A. Use Amazon Athena to query the data in Amazon S3.

    B. Use Amazon Redshift with dense compute nodes to query the data in Amazon Redshift managed storage.

    C. Use Amazon Redshift with RA3 nodes to query the data in Amazon Redshift managed storage.

    D. Use PrestoDB on Amazon EMR to query the data in Amazon S3.
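
    A minimal boto3 sketch of provisioning the RA3 cluster from option C; the identifier, sizing, and credentials are placeholders. RA3 nodes keep data in Redshift managed storage, which separates compute from storage.

    import boto3

    redshift = boto3.client("redshift")

    redshift.create_cluster(
        ClusterIdentifier="teradata-migration",   # placeholder name
        NodeType="ra3.4xlarge",                   # RA3: compute scales independently of managed storage
        NumberOfNodes=4,
        DBName="dw",
        MasterUsername="admin",
        MasterUserPassword="ExamplePassw0rd!",
    )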

  • Question 16:

    A software company wants to use instrumentation data to detect and resolve errors to improve application recovery time. The company requires API usage anomalies, like error rate and response time spikes, to be detected in near-real time (NRT). The company also requires that data analysts have access to dashboards for log analysis in NRT.

    Which solution meets these requirements?

    A. Use Amazon Kinesis Data Firehose as the data transport layer for logging data. Use Amazon Kinesis Data Analytics to uncover the NRT API usage anomalies. Use Kinesis Data Firehose to deliver log data to Amazon OpenSearch Service (Amazon Elasticsearch Service) for search, log analytics, and application monitoring. Use OpenSearch Dashboards (Kibana) in Amazon OpenSearch Service (Amazon Elasticsearch Service) for the dashboards.

    B. Use Amazon Kinesis Data Analytics as the data transport layer for logging data. Use Amazon Kinesis Data Streams to uncover NRT monitoring metrics. Use Amazon Kinesis Data Firehose to deliver log data to Amazon OpenSearch Service (Amazon Elasticsearch Service) for search, log analytics, and application monitoring. Use Amazon QuickSight for the dashboards.

    C. Use Amazon Kinesis Data Analytics as the data transport layer for logging data and to uncover NRT monitoring metrics. Use Amazon Kinesis Data Firehose to deliver log data to Amazon OpenSearch Service (Amazon Elasticsearch Service) for search, log analytics, and application monitoring. Use OpenSearch Dashboards (Kibana) in Amazon OpenSearch Service (Amazon Elasticsearch Service) for the dashboards.

    D. Use Amazon Kinesis Data Firehose as the data transport layer for logging data. Use Amazon Kinesis Data Analytics to uncover NRT monitoring metrics. Use Amazon Kinesis Data Streams to deliver log data to Amazon OpenSearch Service (Amazon Elasticsearch Service) for search, log analytics, and application monitoring. Use Amazon QuickSight for the dashboards.
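
    As a small illustration of option A's transport layer, the sketch below writes application log entries to a Kinesis Data Firehose delivery stream with boto3; the stream name is hypothetical, and the stream itself would be configured to deliver to Amazon OpenSearch Service.

    import json
    import boto3

    firehose = boto3.client("firehose")

    # Hypothetical delivery stream that forwards log data to OpenSearch Service.
    def ship_log(entry: dict) -> None:
        firehose.put_record(
            DeliveryStreamName="api-logs-to-opensearch",
            Record={"Data": (json.dumps(entry) + "\n").encode("utf-8")},
        )

    ship_log({"api": "/orders", "status": 500, "latency_ms": 912})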

  • Question 17:

    A company is building an analytical solution that includes Amazon S3 as data lake storage and Amazon Redshift for data warehousing. The company wants to use Amazon Redshift Spectrum to query the data that is stored in Amazon S3.

    Which steps should the company take to improve performance when the company uses Amazon Redshift Spectrum to query the S3 data files? (Select THREE.)

    A. Use gzip compression with individual file sizes of 1-5 GB.

    B. Use a columnar storage file format.

    C. Partition the data based on the most common query predicates.

    D. Split the data into KB-sized files.

    E. Keep all files about the same size.

    F. Use file formats that are not splittable.
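
    One way to combine options B and C, sketched with pandas/pyarrow (s3fs is needed for the s3:// paths); the bucket paths and column names are illustrative only.

    import pandas as pd

    # Rewrite a raw CSV export as columnar, partitioned Parquet.
    df = pd.read_csv("s3://datalake-raw/events/2021-07-07.csv")
    df["event_date"] = pd.to_datetime(df["event_timestamp"]).dt.date

    df.to_parquet(
        "s3://datalake-curated/events/",   # columnar storage format (option B)
        partition_cols=["event_date"],     # partition on a common query predicate (option C)
        index=False,
    )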

  • Question 18:

    A company uses Amazon Kinesis Data Streams to ingest and process customer behavior information from application users each day. A data analytics specialist notices that its data stream is throttling. The specialist has turned on enhanced monitoring for the Kinesis data stream and has verified that the data stream did not exceed the data limits. The specialist discovers that there are hot shards.

    Which solution will resolve this issue?

    A. Use a random partition key to ingest the records.

    B. Increase the number of shards. Split the size of the log records.

    C. Limit the number of records that are sent each second by the producer to match the capacity of the stream.

    D. Decrease the size of the records that are sent from the producer to match the capacity of the stream.
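
    A minimal producer-side sketch of option A: assigning a random partition key spreads records evenly across shards and removes the hot-shard skew. The stream name and event fields are hypothetical.

    import json
    import uuid
    import boto3

    kinesis = boto3.client("kinesis")

    def put_event(event: dict) -> None:
        # A random partition key distributes writes uniformly across all shards.
        kinesis.put_record(
            StreamName="customer-behavior",
            Data=json.dumps(event).encode("utf-8"),
            PartitionKey=str(uuid.uuid4()),
        )

    put_event({"user_id": "u-123", "action": "page_view"})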

  • Question 19:

    A company uses Amazon Redshift as its data warehouse. The Redshift cluster is not encrypted. A data analytics specialist needs to use hardware security module (HSM)-managed encryption keys to encrypt the data that is stored in the Redshift cluster.

    Which combination of steps will meet these requirements? (Choose three.)

    A. Stop all write operations on the source cluster. Unload data from the source cluster.

    B. Copy the data to a new target cluster that is encrypted with AWS Key Management Service (AWS KMS).

    C. Modify the source cluster by activating AWS CloudHSM encryption. Configure Amazon Redshift to automatically migrate data to a new encrypted cluster.

    D. Modify the source cluster by activating encryption from an external HSM. Configure Amazon Redshift to automatically migrate data to a new encrypted cluster.

    E. Copy the data to a new target cluster that is encrypted with an HSM from AWS CloudHSM.

    F. Rename the source cluster and the target cluster after the migration so that the target cluster is using the original endpoint.
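
    For context, a sketch of creating the HSM-encrypted target cluster mentioned in option E with boto3; the HSM client certificate and configuration must already be registered with Amazon Redshift, and every identifier below is a placeholder.

    import boto3

    redshift = boto3.client("redshift")

    redshift.create_cluster(
        ClusterIdentifier="analytics-encrypted",             # placeholder target cluster
        NodeType="ra3.4xlarge",
        NumberOfNodes=2,
        MasterUsername="admin",
        MasterUserPassword="ExamplePassw0rd!",
        Encrypted=True,
        HsmClientCertificateIdentifier="my-hsm-client-cert",
        HsmConfigurationIdentifier="my-hsm-configuration",
    )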

  • Question 20:

    An IoT company wants to release a new device that will collect data to track overnight sleep on an intelligent mattress. Sensors will send data that will be uploaded to an Amazon S3 bucket. Each mattress generates about 2 MB of data each night.

    An application must process the data and summarize the data for each user. The application must make the results available as soon as possible. Every invocation of the application will require about 1 GB of memory and will finish running within 30 seconds.

    Which solution will run the application MOST cost-effectively?

    A. AWS Lambda with a Python script

    B. AWS Glue with a Scala job

    C. Amazon EMR with an Apache Spark script

    D. AWS Glue with a PySpark job
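
    To make option A concrete, here is a minimal Lambda handler triggered by the S3 upload event; the record format, summary fields, and output bucket are invented for illustration, and a roughly 2 MB nightly file fits easily within a 1 GB, sub-30-second invocation.

    import json
    import boto3

    s3 = boto3.client("s3")

    def lambda_handler(event, context):
        # Locate the newly uploaded object from the S3 event notification.
        record = event["Records"][0]["s3"]
        bucket = record["bucket"]["name"]
        key = record["object"]["key"]

        # Assume one JSON reading per line (hypothetical format).
        body = s3.get_object(Bucket=bucket, Key=key)["Body"].read().decode("utf-8")
        readings = [json.loads(line) for line in body.splitlines() if line]

        summary = {
            "samples": len(readings),
            "avg_heart_rate": sum(r["heart_rate"] for r in readings) / max(len(readings), 1),
        }

        # Write the per-user summary back to S3 so it is available immediately.
        s3.put_object(
            Bucket="sleep-summaries",
            Key=f"summaries/{key}.json",
            Body=json.dumps(summary).encode("utf-8"),
        )
        return summary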

Tips on How to Prepare for the Exams

Nowadays, certification exams have become increasingly important and are required by more and more enterprises when you apply for a job. But how do you prepare for the exam effectively? How do you prepare in a short time with less effort? How do you get an ideal result, and how do you find the most reliable resources? Here on Vcedump.com, you will find all the answers. Vcedump.com provides not only Amazon exam questions, answers, and explanations but also complete assistance with your exam preparation and certification application. If you are unsure about your DAS-C01 exam preparation or Amazon certification application, do not hesitate to visit Vcedump.com to find your solutions.