Exam Details

  • Exam Code: DAS-C01
  • Exam Name: AWS Certified Data Analytics - Specialty (DAS-C01)
  • Certification: Amazon Certifications
  • Vendor: Amazon
  • Total Questions: 285 Q&As
  • Last Updated: Apr 27, 2025

Amazon Certifications DAS-C01 Questions & Answers

  • Question 31:

    A company receives data in CSV format from partners. The company stores this incoming raw data in Amazon S3. The company must clean the data by addressing missing values, incorrect formatting, and outlier values before the company sends the data to a reporting dashboard.

    Which solution will meet these requirements with the LEAST development effort?

    A. Implement an AWS Glue ETL job. Include the data cleaning logic in the ETL job.

    B. Create an AWS Glue DataBrew recipe job. Include appropriate steps in the recipe job to detect and change specific data fields.

    C. Launch an Amazon EMR cluster. Run an Apache Spark job to read and clean the data. Include the data cleaning logic in the Spark job.

    D. Use an Amazon EMR serverless runtime. Run an Apache Spark job to read and clean the data. Include the data cleaning logic in the Spark job.
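
    For context on option B (AWS Glue DataBrew), the cleaning workflow can be wired up with a handful of API calls. Below is a minimal boto3 sketch, assuming placeholder bucket names, role ARN, and column names; the recipe operation names are illustrative stand-ins for DataBrew's built-in transformations:

    ```python
    import boto3

    databrew = boto3.client("databrew")

    # All names, ARNs, columns, and recipe operations below are illustrative placeholders.
    databrew.create_dataset(
        Name="partner-raw-csv",
        Format="CSV",
        Input={"S3InputDefinition": {"Bucket": "partner-raw-data", "Key": "incoming/"}},
    )

    # A recipe captures the cleaning steps (missing values, formatting, outliers).
    databrew.create_recipe(
        Name="partner-cleaning-recipe",
        Steps=[
            {"Action": {"Operation": "REMOVE_MISSING",        # e.g. drop rows with empty values
                        "Parameters": {"sourceColumn": "amount"}}},
            {"Action": {"Operation": "FORMAT_DATE",           # e.g. normalize date formatting
                        "Parameters": {"sourceColumn": "order_date",
                                       "targetDateFormat": "yyyy-MM-dd"}}},
        ],
    )
    databrew.publish_recipe(Name="partner-cleaning-recipe")

    # The recipe job applies the published recipe to the dataset and writes the
    # cleaned output back to S3 for the reporting dashboard to consume.
    databrew.create_recipe_job(
        Name="partner-cleaning-job",
        DatasetName="partner-raw-csv",
        RecipeReference={"Name": "partner-cleaning-recipe", "RecipeVersion": "1.0"},
        RoleArn="arn:aws:iam::111122223333:role/DataBrewServiceRole",
        Outputs=[{"Location": {"Bucket": "partner-clean-data", "Key": "cleaned/"},
                  "Format": "CSV"}],
    )
    databrew.start_job_run(Name="partner-cleaning-job")
    ```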

  • Question 32:

    A large company has several independent business units. Each business unit is responsible for its own data but needs to share data with other units for collaboration. Each unit stores data in an Amazon S3 data lake created with AWS Lake Formation. To create dashboard reports, the marketing team wants to join its data stored in an Amazon Redshift cluster with the sales team's customer table stored in the data lake. The sales team has a large number of tables and schemas, but the marketing team should only have access to the customer table. The solution must be secure and scalable.

    Which set of actions meets these requirements?

    A. The sales team shares the AWS Glue Data Catalog customer table with the marketing team in read-only mode using the named resource method. The marketing team accepts the datashare using AWS Resource Access Manager (AWS RAM) and creates a resource link to the shared customer table. The marketing team joins its data with the customer table using Amazon Redshift Spectrum.

    B. The marketing team creates an S3 cross-account replication between the sales team's S3 bucket as the source and the marketing team's S3 bucket as the destination. The marketing team runs an AWS Glue crawler on the replicated data in its AWS account to create an AWS Glue Data Catalog customer table. The marketing team joins its data with the customer table using Amazon Redshift Spectrum.

    C. The marketing team creates an AWS Lambda function in the sales team's account to replicate data between the sales team's S3 bucket as the source and the marketing team's S3 bucket as the destination. The marketing team runs an AWS Glue crawler on the replicated data in its AWS account to create an AWS Glue Data Catalog customer table. The marketing team joins its data with the customer table using Amazon Redshift Spectrum.

    D. The sales team shares the AWS Glue Data Catalog customer table with the marketing team in read-only mode using the Lake Formation tag-based access control (LF-TBAC) method. The sales team updates the AWS Glue Data Catalog resource policy to add relevant permissions for the marketing team. The marketing team creates a resource link to the shared customer table. The marketing team joins its data with the customer table using Amazon Redshift Spectrum.
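
    For context on option A, sharing a single table with the named resource method and consuming it through a resource link maps to a few Lake Formation and AWS Glue calls. Below is a minimal boto3 sketch, assuming placeholder account IDs and database/table names; accepting the AWS RAM invitation and creating the Redshift Spectrum external schema are omitted:

    ```python
    import boto3

    # Run in the sales account (placeholder ID 111111111111): grant the marketing
    # account (placeholder ID 222222222222) read-only access to one table using
    # the Lake Formation named resource method.
    lakeformation = boto3.client("lakeformation")
    lakeformation.grant_permissions(
        Principal={"DataLakePrincipalIdentifier": "222222222222"},
        Resource={"Table": {"CatalogId": "111111111111",
                            "DatabaseName": "sales_db",
                            "Name": "customer"}},
        Permissions=["SELECT"],
        PermissionsWithGrantOption=[],
    )

    # Run in the marketing account (after accepting the AWS RAM invitation): create
    # a resource link in the local Data Catalog that points at the shared table.
    glue = boto3.client("glue")
    glue.create_table(
        DatabaseName="marketing_db",
        TableInput={
            "Name": "customer_link",
            "TargetTable": {"CatalogId": "111111111111",
                            "DatabaseName": "sales_db",
                            "Name": "customer"},
        },
    )

    # An external schema in Amazon Redshift that references marketing_db then lets
    # Redshift Spectrum join local marketing tables with the shared customer table.
    ```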

  • Question 33:

    A data analytics specialist is maintaining a company's on-premises Apache Hadoop environment. In this environment, the company uses Apache Spark jobs for data transformation and uses Apache Presto for on-demand queries. The Spark jobs consist of many intermediate steps that require high-speed random I/O during processing. Some jobs can be restarted without losing the original data.

    The data analytics specialist decides to migrate the workload to an Amazon EMR cluster. The data analytics specialist must implement a solution that will scale the cluster automatically.

    Which solution will meet these requirements with the FASTEST I/O?

    A. Use Hadoop Distributed File System (HDFS). Configure the EMR cluster as an instance fleet with custom automatic scaling.

    B. Use EMR File System (EMRFS). Configure the EMR cluster as a uniform instance group with EMR managed scaling.

    C. Use Hadoop Distributed File System (HDFS). Configure the EMR cluster as an instance group with custom automatic scaling.

    D. Use EMR File System (EMRFS). Configure the EMR cluster as an instance fleet with custom automatic scaling.
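
    For context on option C, HDFS keeps intermediate data on the cluster's local disks for fast random I/O, and a custom automatic scaling policy is attached to an instance group. Below is a minimal boto3 sketch with placeholder cluster and instance group IDs; the metric and thresholds are illustrative:

    ```python
    import boto3

    emr = boto3.client("emr")

    # Attach a custom automatic scaling policy to a core or task instance group.
    # Cluster ID, instance group ID, capacities, and thresholds are placeholders.
    emr.put_auto_scaling_policy(
        ClusterId="j-EXAMPLE1234567",
        InstanceGroupId="ig-EXAMPLE1234567",
        AutoScalingPolicy={
            "Constraints": {"MinCapacity": 2, "MaxCapacity": 10},
            "Rules": [
                {
                    "Name": "ScaleOutOnLowYarnMemory",
                    "Action": {"SimpleScalingPolicyConfiguration": {
                        "AdjustmentType": "CHANGE_IN_CAPACITY",
                        "ScalingAdjustment": 2,
                        "CoolDown": 300,
                    }},
                    "Trigger": {"CloudWatchAlarmDefinition": {
                        "ComparisonOperator": "LESS_THAN",
                        "EvaluationPeriods": 1,
                        "MetricName": "YARNMemoryAvailablePercentage",
                        "Namespace": "AWS/ElasticMapReduce",
                        "Period": 300,
                        "Statistic": "AVERAGE",
                        "Threshold": 15.0,
                        "Unit": "PERCENT",
                    }},
                },
            ],
        },
    )
    ```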

  • Question 34:

    A company wants to use automatic machine learning (ML) to create and visualize forecasts of complex scenarios and trends.

    Which solution will meet these requirements with the LEAST management overhead?

    A. Use an AWS Glue ML job to transform the data and create forecasts. Use Amazon QuickSight to visualize the data.

    B. Use Amazon QuickSight to visualize the data. Use ML-powered forecasting in QuickSight to create forecasts.

    C. Use a prebuilt ML AMI from the AWS Marketplace to create forecasts. Use Amazon QuickSight to visualize the data.

    D. Use Amazon SageMaker inference pipelines to create and update forecasts. Use Amazon QuickSight to visualize the combined data.

  • Question 35:

    A company hosts a large data warehouse on Amazon Redshift. A business intelligence (BI) team requires access to tables in schemas A and B. However, the BI team must not have access to tables in schema C.

    Members of the BI team connect to the Redshift cluster through a client that uses a JDBC connector. A data analytics specialist needs to set up access for these users.

    Which combination of steps will meet these requirements? (Choose two.)

    A. Create an IAM user for each BI team member who requires access. Create an IAM group for these users.

    B. Create a database user for each BI team member who requires access. Create a database user group for these users.

    C. Create an IAM policy that grants read and write permissions for schemas A and B to the BI IAM group and denies read and write permissions for schema C to the BI IAM group. Attach the policy to the BI IAM group.

    D. Use the GRANT command to grant access to the BI database user group for schemas A and B. Use the REVOKE command to block access for the BI database user group for schema C.

    E. Specify the WITH MANAGED ACCESS parameter during the creation of schema C.
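
    For context on options B and D, access here is controlled with native Redshift database users, groups, and GRANT/REVOKE statements rather than IAM policies. Below is a minimal sketch that submits the SQL through the Redshift Data API, assuming placeholder cluster, database, user, and schema names:

    ```python
    import boto3

    redshift_data = boto3.client("redshift-data")

    # Cluster identifier, database, admin user, password, and schema names are placeholders.
    statements = [
        "CREATE GROUP bi_group;",
        "CREATE USER bi_analyst_1 PASSWORD 'ChangeMe123!' IN GROUP bi_group;",
        # Allow the BI group to use and read schemas A and B.
        "GRANT USAGE ON SCHEMA schema_a TO GROUP bi_group;",
        "GRANT SELECT ON ALL TABLES IN SCHEMA schema_a TO GROUP bi_group;",
        "GRANT USAGE ON SCHEMA schema_b TO GROUP bi_group;",
        "GRANT SELECT ON ALL TABLES IN SCHEMA schema_b TO GROUP bi_group;",
        # Explicitly block access to schema C.
        "REVOKE ALL ON SCHEMA schema_c FROM GROUP bi_group;",
    ]

    for sql in statements:
        redshift_data.execute_statement(
            ClusterIdentifier="bi-warehouse",  # placeholder cluster
            Database="dev",
            DbUser="admin",                    # placeholder admin user
            Sql=sql,
        )
    ```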

  • Question 36:

    An online retail company has an application that runs on Amazon EC2 instances launched in a VPC. The company wants to build a solution that allows the security team to collect VPC Flow Logs and analyze network traffic.

    Which solution MOST cost-effectively meets these requirements?

    A. Publish VPC Flow Logs to Amazon CloudWatch Logs and use Amazon Athena for analytics.

    B. Publish VPC Flow Logs to Amazon CloudWatch Logs and stream log data to an Amazon OpenSearch Service cluster for analytics.

    C. Publish VPC Flow Logs to Amazon S3 in text format and use Amazon Athena for analytics.

    D. Publish VPC Flow Logs to Amazon S3 in Apache Parquet format and use Amazon Athena for analytics.
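
    For context on option D, publishing flow logs straight to S3 in Apache Parquet format is a single API call, and the columnar output keeps Athena scans small. Below is a minimal boto3 sketch with placeholder VPC, bucket, and prefix values:

    ```python
    import boto3

    ec2 = boto3.client("ec2")

    # VPC ID, bucket ARN, and prefix are placeholders.
    ec2.create_flow_logs(
        ResourceIds=["vpc-0123456789abcdef0"],
        ResourceType="VPC",
        TrafficType="ALL",
        LogDestinationType="s3",
        LogDestination="arn:aws:s3:::security-flow-logs/vpc-flow/",
        DestinationOptions={
            "FileFormat": "parquet",            # columnar output for cheaper Athena scans
            "HiveCompatiblePartitions": True,   # partition layout Athena can use directly
            "PerHourPartition": True,
        },
    )

    # A Glue Data Catalog table over the bucket then lets the security team query
    # the network traffic with standard SQL in Amazon Athena.
    ```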

  • Question 37:

    A company plans to store quarterly financial statements in a dedicated Amazon S3 bucket. The financial statements must not be modified or deleted after they are saved to the S3 bucket.

    Which solution will meet these requirements?

    A. Create the S3 bucket with S3 Object Lock in governance mode.

    B. Create the S3 bucket with MFA delete enabled.

    C. Create the S3 bucket with S3 Object Lock in compliance mode.

    D. Create S3 buckets in two AWS Regions. Use S3 Cross-Region Replication (CRR) between the buckets.
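
    For context on option C, S3 Object Lock must be enabled when the bucket is created; a default retention rule in compliance mode then prevents any user, including the root user, from overwriting or deleting locked object versions until the retention period ends. Below is a minimal boto3 sketch with a placeholder bucket name, Region, and retention period:

    ```python
    import boto3

    s3 = boto3.client("s3")

    # Bucket name, Region, and retention period are placeholders.
    s3.create_bucket(
        Bucket="quarterly-financial-statements",
        CreateBucketConfiguration={"LocationConstraint": "us-west-2"},
        ObjectLockEnabledForBucket=True,  # Object Lock can only be enabled at creation
    )

    # Compliance mode: locked object versions cannot be overwritten or deleted by
    # anyone until the retention period expires.
    s3.put_object_lock_configuration(
        Bucket="quarterly-financial-statements",
        ObjectLockConfiguration={
            "ObjectLockEnabled": "Enabled",
            "Rule": {"DefaultRetention": {"Mode": "COMPLIANCE", "Years": 7}},
        },
    )
    ```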

  • Question 38:

    A banking company plans to build a data warehouse solution on AWS to run join queries on 20 TB of data. These queries will be complex and analytical. About 10% of the data is from the past 3 months. Data older than 3 months needs to be accessed occasionally to run queries.

    Which solution MOST cost-effectively meets these requirements?

    A. Use Amazon S3 as the data store and use Amazon Athena for the queries. Use Amazon S3 Glacier Flexible Retrieval for storing data older than 3 months by using S3 lifecycle policies.

    B. Use Amazon Redshift to build a data warehouse solution. Create an AWS Lambda function that is orchestrated by AWS Step Functions to run the UNLOAD command on data older than 3 months from the Redshift database to Amazon S3. Use Amazon Redshift Spectrum to query the data in Amazon S3.

    C. Use Amazon Redshift to build a data warehouse solution. Use RA3 instances for the Redshift cluster so that data requested for a query is stored in a solid state drive (SSD) for fast local storage and Amazon S3 for longer-term durable storage.

    D. Use Amazon Elastic File System (Amazon EFS) to build a data warehouse solution for data storage. Use Amazon EFS lifecycle management to retire data older than 3 months to the S3 Standard-Infrequent Access (S3 Standard-IA) class. Use Apache Presto on an Amazon EMR cluster to query the data interactively.
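
    For context on option C, RA3 node types separate compute from Redshift managed storage: frequently accessed blocks are cached on local SSDs while colder data is kept durably in Amazon S3, without unloading anything manually. Below is a minimal provisioning sketch with placeholder identifiers, credentials, and sizing:

    ```python
    import boto3

    redshift = boto3.client("redshift")

    # Identifiers, credentials, and node count are placeholders; RA3 managed storage
    # keeps hot blocks on local SSD and tiers the rest to Amazon S3 automatically.
    redshift.create_cluster(
        ClusterIdentifier="finance-dw",
        NodeType="ra3.4xlarge",
        NumberOfNodes=3,
        ClusterType="multi-node",
        DBName="analytics",
        MasterUsername="awsuser",
        MasterUserPassword="ChangeMe123!",   # placeholder credential
        PubliclyAccessible=False,
    )
    ```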

  • Question 39:

    A company is running Apache Spark on an Amazon EMR cluster. The Spark job writes data to an Amazon S3 bucket and generates a large number of PUT requests. The number of objects has increased over time.

    After a recent increase in traffic, the Spark job started failing and returned an HTTP 503 Slow Down AmazonS3Exception error.

    Which combination of actions will resolve this error? (Choose two.)

    A. Increase the number of S3 key prefixes for the S3 bucket.

    B. Increase the EMR File System (EMRFS) retry limit.

    C. Disable dynamic partition pruning in the Spark configuration for the cluster.

    D. Increase the repartitioning number for the Spark job.

    E. Increase the executor memory size on Spark.
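
    For context on options B and D, the EMRFS retry limit is raised through the emrfs-site configuration classification, and the Spark job's writes can be spread across more S3 keys by repartitioning before the write. Below is a minimal sketch; the retry value, partition count, and S3 paths are placeholders:

    ```python
    from pyspark.sql import SparkSession

    # Cluster-side setting (supplied at cluster creation, e.g. via the Configurations
    # parameter of the EMR API or the console): raise the EMRFS retry limit.
    EMRFS_RETRY_CONFIG = [{
        "Classification": "emrfs-site",
        "Properties": {"fs.s3.maxRetries": "50"},  # placeholder value above the default
    }]

    # Job-side change: repartition so writes are spread across more S3 keys instead
    # of concentrating PUT requests on a few. Paths and counts are placeholders.
    spark = SparkSession.builder.appName("s3-write").getOrCreate()
    df = spark.read.parquet("s3://example-input/data/")
    (df.repartition(400)
       .write.mode("overwrite")
       .parquet("s3://example-output/cleaned/"))
    ```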

  • Question 40:

    A company stores financial performance records of its various portfolios in CSV format in Amazon S3. A data analytics specialist needs to make this data accessible in the AWS Glue Data Catalog for the company's data analysts. The data analytics specialist creates an AWS Glue crawler in the AWS Glue console.

    What must the data analytics specialist do next to make the data accessible for the data analysts?

    A. Create an IAM role that includes the AWSGlueExecutionRole policy. Associate the role with the crawler. Specify the S3 path of the source data as the crawler's data store. Create a schedule to run the crawler. Point to the S3 path for the output.

    B. Create an IAM role that includes the AWSGlueServiceRole policy. Associate the role with the crawler. Specify the S3 path of the source data as the crawler's data store. Create a schedule to run the crawler. Specify a database name for the output.

    C. Create an IAM role that includes the AWSGlueExecutionRole policy. Associate the role with the crawler. Specify the S3 path of the source data as the crawler's data store. Allocate data processing units (DPUs) to run the crawler. Specify a database name for the output.

    D. Create an IAM role that includes the AWSGlueServiceRole policy. Associate the role with the crawler. Specify the S3 path of the source data as the crawler's data store. Allocate data processing units (DPUs) to run the crawler. Point to the S3 path for the output.
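
    For context on option B, a crawler needs a service role (the AWSGlueServiceRole managed policy plus read access to the source bucket), the S3 path as its data store, a schedule or on-demand run, and a target database for the tables it creates. Below is a minimal boto3 sketch with placeholder names, path, role ARN, and schedule:

    ```python
    import boto3

    glue = boto3.client("glue")

    # Role ARN, S3 path, database name, and schedule are placeholders.
    glue.create_crawler(
        Name="portfolio-csv-crawler",
        Role="arn:aws:iam::111122223333:role/GlueCrawlerRole",
        DatabaseName="financial_performance",                   # catalog database for output
        Targets={"S3Targets": [{"Path": "s3://portfolio-records/csv/"}]},
        Schedule="cron(0 2 * * ? *)",                           # daily at 02:00 UTC
    )

    glue.start_crawler(Name="portfolio-csv-crawler")
    ```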

Tips on How to Prepare for the Exams

Certification exams have become increasingly important, and more and more employers require them when you apply for a job. But how do you prepare for an exam effectively? How do you prepare in a short time with less effort? How do you achieve an ideal result and find the most reliable resources? Here on Vcedump.com, you will find all the answers. Vcedump.com provides not only Amazon exam questions, answers, and explanations but also complete assistance with your exam preparation and certification application. If you are unsure about your DAS-C01 exam preparation or Amazon certification application, do not hesitate to visit Vcedump.com to find your solutions.