Exam Details

  • Exam Code: DP-203
  • Exam Name: Data Engineering on Microsoft Azure
  • Certification: Microsoft Certifications
  • Vendor: Microsoft
  • Total Questions: 398 Q&As
  • Last Updated: Mar 30, 2025

Microsoft Certifications DP-203 Questions & Answers

  • Question 241:

    Note: This question is part of a series of questions that present the same scenario. Each question in the series contains a unique solution that might meet the stated goals. Some question sets might have more than one correct solution, while others might not have a correct solution.

    After you answer a question in this scenario, you will NOT be able to return to it. As a result, these questions will not appear in the review screen.

    You have an Azure Storage account that contains 100 GB of files. The files contain text and numerical values. 75% of the rows contain description data that has an average length of 1.1 MB.

    You plan to copy the data from the storage account to an Azure SQL data warehouse.

    You need to prepare the files to ensure that the data copies quickly.

    Solution: You modify the files to ensure that each row is less than 1 MB.

    Does this meet the goal?

    A. Yes

    B. No

  • Question 242:

    After you answer a question in this section, you will NOT be able to return to it. As a result, these questions will not appear in the review screen.

    You have an Azure Synapse Analytics dedicated SQL pool that contains a table named Table1.

    You have files that are ingested and loaded into an Azure Data Lake Storage Gen2 container named container1.

    You plan to insert data from the files in container1 into Table1 and transform the data. Each row of data in the files will produce one row in the serving layer of Table1.

    You need to ensure that when the source data files are loaded to container1, the DateTime is stored as an additional column in Table1.

    Solution: You use a dedicated SQL pool to create an external table that has an additional DateTime column.

    Does this meet the goal?

    A. Yes

    B. No

  • Question 243:

    You use Azure Stream Analytics to receive Twitter data from Azure Event Hubs and to output the data to an Azure Blob storage account.

    You need to output the count of tweets during the last five minutes every five minutes.

    Each tweet must only be counted once.

    Which windowing function should you use?

    A. a five-minute Session window

    B. a five-minute Sliding window

    C. a five-minute Tumbling window

    D. a five-minute Hopping window that has a one-minute hop
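The distinguishing property here is that a tumbling window partitions time into fixed, non-overlapping intervals, so each event lands in exactly one window. A minimal pure-Python sketch of that bucketing (illustration only, not Stream Analytics query syntax; the timestamps are invented):

```python
from collections import Counter

def tumbling_window_counts(timestamps, window_seconds=300):
    """Count events per non-overlapping (tumbling) window.

    Each timestamp falls into exactly one bucket, so no event
    is ever counted twice -- the property the question requires.
    """
    counts = Counter()
    for ts in timestamps:
        window_start = (ts // window_seconds) * window_seconds
        counts[window_start] += 1
    return dict(counts)

# Tweets at 0s, 100s, 250s fall in window [0, 300); 400s falls in [300, 600).
print(tumbling_window_counts([0, 100, 250, 400]))  # {0: 3, 300: 1}
```

A sliding or hopping window would overlap intervals, so the same tweet could be counted in more than one window.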

  • Question 244:

    You are designing a solution that will copy Parquet files stored in an Azure Blob storage account to an Azure Data Lake Storage Gen2 account.

    The data will be loaded daily to the data lake and will use a folder structure of {Year}/{Month}/{Day}/.

    You need to design a daily Azure Data Factory data load to minimize the data transfer between the two accounts.

    Which two configurations should you include in the design? Each correct answer presents part of the solution.

    NOTE: Each correct selection is worth one point.

    A. Delete the files in the destination before loading new data.

    B. Filter by the last modified date of the source files.

    C. Delete the source files after they are copied.

    D. Specify a file naming pattern for the destination.
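The idea behind filtering by last modified date is an incremental copy: only files changed since the previous daily run are transferred. A plain-Python sketch of that filter (the file catalog and `modified` field are invented stand-ins for Blob storage metadata, not a Data Factory API):

```python
def files_to_copy(files, last_run):
    """Return only files modified after the previous load,
    minimizing the data transferred on each daily run."""
    return [f["name"] for f in files if f["modified"] > last_run]

catalog = [
    {"name": "2024/01/01/sales.parquet", "modified": 100},
    {"name": "2024/01/02/sales.parquet", "modified": 200},
]
print(files_to_copy(catalog, last_run=150))  # ['2024/01/02/sales.parquet']
```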

  • Question 245:

    After you answer a question in this section, you will NOT be able to return to it. As a result, these questions will not appear in the review screen.

    You have an Azure Synapse Analytics dedicated SQL pool that contains a table named Table1.

    You have files that are ingested and loaded into an Azure Data Lake Storage Gen2 container named container1.

    You plan to insert data from the files in container1 into Table1 and transform the data. Each row of data in the files will produce one row in the serving layer of Table1.

    You need to ensure that when the source data files are loaded to container1, the DateTime is stored as an additional column in Table1.

    Solution: In an Azure Synapse Analytics pipeline, you use a data flow that contains a Derived Column transformation.

    Does this meet the goal?

    A. Yes

    B. No
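What a Derived Column transformation does, conceptually, is append a computed column, such as a load timestamp, to every row flowing through the data flow. A Python sketch of that effect (the column and field names are illustrative, not taken from the question):

```python
from datetime import datetime, timezone

def add_load_datetime(rows, now=None):
    """Append a LoadDateTime column to every incoming row,
    mirroring what a Derived Column transformation produces."""
    stamp = now or datetime.now(timezone.utc)
    return [{**row, "LoadDateTime": stamp} for row in rows]

rows = [{"id": 1}, {"id": 2}]
fixed = datetime(2024, 1, 1, tzinfo=timezone.utc)
print(add_load_datetime(rows, now=fixed))
```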

  • Question 246:

    You have an Azure Synapse Analytics job that uses Scala.

    You need to view the status of the job.

    What should you do?

    A. From Azure Monitor, run a Kusto query against the AzureDiagnostics table.

    B. From Azure Monitor, run a Kusto query against the SparkLoggingEvent_CL table.

    C. From Synapse Studio, select the workspace. From Monitor, select Apache Spark applications.

    D. From Synapse Studio, select the workspace. From Monitor, select SQL requests.

  • Question 247:

    You have an Azure Synapse Analytics dedicated SQL pool named Pool1. Pool1 contains a partitioned fact table named dbo.Sales and a staging table named stg.Sales that has matching table and partition definitions.

    You need to overwrite the content of the first partition in dbo.Sales with the content of the same partition in stg.Sales. The solution must minimize load times.

    What should you do?

    A. Switch the first partition from dbo.Sales to stg.Sales.

    B. Switch the first partition from stg.Sales to dbo.Sales.

    C. Update dbo.Sales from stg.Sales.

    D. Insert the data from stg.Sales into dbo.Sales.
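Partition switching minimizes load times because it reassigns metadata rather than moving rows. A conceptual Python sketch of that distinction (tables modeled as dicts of partition contents; this is an analogy, not T-SQL):

```python
def switch_partition(staging, fact, partition):
    """Move a partition from the staging table into the fact table
    by reassigning the reference -- no row-by-row copy, which is
    why a partition switch is near-instant regardless of row count."""
    fact[partition] = staging.pop(partition)

stg_sales = {1: ["new rows for partition 1"]}
dbo_sales = {1: ["old rows"], 2: ["untouched rows"]}
switch_partition(stg_sales, dbo_sales, partition=1)
print(dbo_sales[1])  # ['new rows for partition 1']
```

An UPDATE or INSERT, by contrast, would physically rewrite or copy every row in the partition.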

  • Question 248:

    A company purchases IoT devices to monitor manufacturing machinery. The company uses an IoT appliance to communicate with the IoT devices.

    The company must be able to monitor the devices in real-time.

    You need to design the solution.

    What should you recommend?

    A. Azure Stream Analytics cloud job using Azure PowerShell

    B. Azure Analysis Services using Azure Portal

    C. Azure Data Factory instance using Azure Portal

    D. Azure Analysis Services using Azure PowerShell

  • Question 249:

    You have an Azure Stream Analytics query. The query returns a result set that contains 10,000 distinct values for a column named clusterID.

    You monitor the Stream Analytics job and discover high latency.

    You need to reduce the latency.

    Which two actions should you perform? Each correct answer presents a complete solution.

    NOTE: Each correct selection is worth one point.

    A. Add a pass-through query.

    B. Add a temporal analytic function.

    C. Scale out the query by using PARTITION BY.

    D. Convert the query to a reference query.

    E. Increase the number of streaming units.
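The intuition behind scaling out with PARTITION BY is that events sharing a clusterID can be routed to the same partition and processed in parallel. A toy Python stand-in for that hash partitioning (not Stream Analytics syntax; integer clusterIDs are used so the example is deterministic):

```python
from collections import defaultdict

def partition_events(events, num_partitions=4):
    """Hash-partition events by clusterID so each partition can be
    aggregated independently -- the idea behind PARTITION BY."""
    partitions = defaultdict(list)
    for e in events:
        partitions[hash(e["clusterID"]) % num_partitions].append(e)
    return partitions

events = [{"clusterID": i, "value": i * 10} for i in range(8)]
parts = partition_events(events)
print({p: len(rows) for p, rows in sorted(parts.items())})  # {0: 2, 1: 2, 2: 2, 3: 2}
```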

  • Question 250:

    You are planning a streaming data solution that will use Azure Databricks. The solution will stream sales transaction data from an online store. The solution has the following specifications:

    1. The output data will contain items purchased, quantity, line total sales amount, and line total tax amount.

    2. Line total sales amount and line total tax amount will be aggregated in Databricks.

    3. Sales transactions will never be updated. Instead, new rows will be added to adjust a sale.

    You need to recommend an output mode for the dataset that will be processed by using Structured Streaming. The solution must minimize duplicate data.

    What should you recommend?

    A. Append

    B. Update

    C. Complete
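Because sales rows are never updated, Append mode can emit each finalized row exactly once, whereas Complete mode re-emits the whole result table on every trigger. A simplified Python sketch of the contrast (a conceptual model, not the PySpark API):

```python
def append_mode(batches):
    """Emit only the rows added since the last trigger (Append mode).
    Safe here because sales rows are insert-only, never updated."""
    out = []
    for new_rows in batches:
        out.extend(new_rows)
    return out

def complete_mode(batches):
    """Re-emit the entire result table on every trigger (Complete mode),
    so earlier rows appear again in the output -- more duplicates."""
    out, table = [], []
    for new_rows in batches:
        table.extend(new_rows)
        out.extend(table)
    return out

batches = [["sale1"], ["sale2"]]
print(append_mode(batches))    # ['sale1', 'sale2']
print(complete_mode(batches))  # ['sale1', 'sale1', 'sale2']
```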

Tips on How to Prepare for the Exams

Nowadays, certification exams are becoming more important and are required by more and more enterprises when you apply for a job. But how do you prepare for an exam effectively? How do you prepare in a short time with less effort? How do you get an ideal result, and how do you find the most reliable resources? Here on Vcedump.com, you will find all the answers. Vcedump.com provides not only Microsoft exam questions, answers, and explanations but also complete assistance with your exam preparation and certification application. If you are unsure about your DP-203 exam preparation or Microsoft certification application, do not hesitate to visit Vcedump.com to find your solutions.