You are designing a fact table named FactPurchase in an Azure Synapse Analytics dedicated SQL pool. The table contains purchases from suppliers for a retail store. FactPurchase will contain the following columns.
FactPurchase will have 1 million rows of data added daily and will contain three years of data. Transact-SQL queries similar to the following query will be executed daily.
SELECT SupplierKey, StockItemKey, COUNT(*)
FROM FactPurchase
WHERE DateKey >= 20210101 AND DateKey <= 20210131
GROUP BY SupplierKey, StockItemKey
Which table distribution will minimize query times?
A. round-robin
B. replicated
C. hash-distributed on DateKey
D. hash-distributed on PurchaseKey
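As background for the hash-distributed options, here is a minimal Python sketch of how rows spread over the 60 distributions of a dedicated SQL pool depending on the chosen key. The modulo function is only a toy stand-in for the engine's real hash function, and the row volumes are scaled-down, made-up values.

```python
from collections import Counter

NUM_DISTRIBUTIONS = 60  # a dedicated SQL pool spreads each table over 60 distributions


def toy_distribution(key: int) -> int:
    """Toy stand-in for the engine's hash function (an assumption, not the real one)."""
    return key % NUM_DISTRIBUTIONS


def rows_per_distribution(keys):
    """Count how many rows land in each of the 60 distributions."""
    counts = Counter(toy_distribution(k) for k in keys)
    return [counts.get(d, 0) for d in range(NUM_DISTRIBUTIONS)]


# Hypothetical sample: January 2021, 1,000 rows per day (scaled down from 1 million).
date_keys = [20210100 + day for day in range(1, 32) for _ in range(1000)]
# Hypothetical surrogate key: one unique value per row.
purchase_keys = list(range(1, len(date_keys) + 1))

by_date = rows_per_distribution(date_keys)
by_purchase = rows_per_distribution(purchase_keys)

# Number of distributions that hold at least one row of the January data.
used_by_date = sum(1 for c in by_date if c > 0)
used_by_purchase = sum(1 for c in by_purchase if c > 0)
print(used_by_date, used_by_purchase)
```

In this toy model a key with few distinct values in the filtered range can occupy only as many distributions as it has distinct values, while a unique key spreads the same rows almost evenly.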
You plan to create an Azure Data Factory pipeline that will include a mapping data flow. You have JSON data containing objects that have nested arrays.
You need to transform the JSON-formatted data into a tabular dataset. The dataset must have one row for each item in the arrays. Which transformation method should you use in the mapping data flow?
A. unpivot
B. flatten
C. new branch
D. alter row
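To make the requirement concrete, the following is a minimal sketch in plain Python (not Data Factory data flow script) of unrolling a nested array into one row per array item. The field names `orderId`, `items`, `sku`, and `qty` are hypothetical.

```python
import json


def unroll(records, array_field):
    """Return one flat row per item of `array_field`, repeating the parent fields."""
    rows = []
    for record in records:
        # Parent fields are every field except the nested array itself.
        parent = {k: v for k, v in record.items() if k != array_field}
        for item in record.get(array_field, []):
            rows.append({**parent, **item})
    return rows


# Hypothetical input: two objects, each with a nested array.
raw = """
[
  {"orderId": 1, "items": [{"sku": "A", "qty": 2}, {"sku": "B", "qty": 1}]},
  {"orderId": 2, "items": [{"sku": "C", "qty": 5}]}
]
"""
orders = json.loads(raw)
rows = unroll(orders, "items")
for row in rows:
    print(row)
```

Two input objects with three array items in total produce three tabular rows.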
You are designing the folder structure for an Azure Data Lake Storage Gen2 account. You identify the following usage patterns:
1. Users will query data by using Azure Synapse Analytics serverless SQL pools and Azure Synapse Analytics serverless Apache Spark pools.
2. Most queries will include a filter on the current year or week.
3. Data will be secured by data source.
You need to recommend a folder structure that meets the following requirements:
1. Supports the usage patterns
2. Simplifies folder security
3. Minimizes query times
Which folder structure should you recommend?
A. \DataSource\SubjectArea\YYYY\WW\FileData_YYYY_MM_DD.parquet
B. \DataSource\SubjectArea\YYYY-WW\FileData_YYYY_MM_DD.parquet
C. DataSource\SubjectArea\WW\YYYY\FileData_YYYY_MM_DD.parquet
D. \YYYY\WW\DataSource\SubjectArea\FileData_YYYY_MM_DD.parquet
E. WW\YYYY\SubjectArea\DataSource\FileData_YYYY_MM_DD.parquet
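The candidate layouts differ only in the order of the folder levels. As an illustration, this small Python sketch renders a file path from a layout template. It assumes that WW means the ISO week number and uses forward slashes as separators; the placeholder names and the sample values `Sales` and `Orders` are hypothetical.

```python
from datetime import date


def render_path(template: str, source: str, area: str, day: date) -> str:
    """Fill a layout template with the data source, subject area, and date parts."""
    # Assumption: the WW folder level is the ISO week of the file date.
    iso_year, iso_week, _ = day.isocalendar()
    return template.format(
        source=source,
        area=area,
        year=f"{iso_year:04d}",
        week=f"{iso_week:02d}",
        file_date=day.strftime("%Y_%m_%d"),
    )


# Two of the candidate layouts, written as templates.
source_first = "{source}/{area}/{year}/{week}/FileData_{file_date}.parquet"
year_first = "{year}/{week}/{source}/{area}/FileData_{file_date}.parquet"

sample_day = date(2021, 1, 15)
path_a = render_path(source_first, "Sales", "Orders", sample_day)
path_d = render_path(year_first, "Sales", "Orders", sample_day)
print(path_a)
print(path_d)
```

Comparing the rendered paths shows which level a folder permission would attach to and which prefix a year or week filter would have to match in each layout.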
You have an Azure Storage account and a data warehouse in Azure Synapse Analytics in the UK South region.
You need to copy blob data from the storage account to the data warehouse by using Azure Data Factory. The solution must meet the following requirements:
1. Ensure that the data remains in the UK South region at all times.
2. Minimize administrative effort.
Which type of integration runtime should you use?
A. Azure integration runtime
B. Azure-SSIS integration runtime
C. Self-hosted integration runtime
You are designing a highly available Azure Data Lake Storage solution that will include geo-zone-redundant storage (GZRS).
You need to monitor for replication delays that can affect the recovery point objective (RPO).
What should you include in the monitoring solution?
A. 5xx: Server Error errors
B. Average Success E2E Latency
C. availability
D. Last Sync Time
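For geo-redundant accounts, the Last Sync Time property marks the point up to which writes to the primary region have been copied to the secondary region. A minimal sketch of turning such a timestamp into a lag check follows. The timestamps and the 15-minute target are hypothetical values, and retrieving the property from the storage account is out of scope here.

```python
from datetime import datetime, timedelta, timezone


def replication_lag(last_sync_time: datetime, now: datetime) -> timedelta:
    """Time window of writes that may not yet exist in the secondary region."""
    return now - last_sync_time


def rpo_at_risk(last_sync_time: datetime, now: datetime, rpo: timedelta) -> bool:
    """True when the current lag exceeds the recovery point objective."""
    return replication_lag(last_sync_time, now) > rpo


# Hypothetical values for illustration only.
now = datetime(2021, 1, 31, 12, 0, tzinfo=timezone.utc)
last_sync = datetime(2021, 1, 31, 11, 40, tzinfo=timezone.utc)
target_rpo = timedelta(minutes=15)

lag = replication_lag(last_sync, now)
at_risk = rpo_at_risk(last_sync, now, target_rpo)
print(lag, at_risk)
```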
You have an Azure Synapse Analytics dedicated SQL pool.
You run PDW_SHOWSPACEUSED('dbo.FactInternetSales'); and get the results shown in the following table.
Which statement accurately describes the dbo.FactInternetSales table?
A. All distributions contain data.
B. The table contains less than 10,000 rows.
C. The table uses round-robin distribution.
D. The table is skewed.
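The results table is not reproduced here, but output of this kind lists a row count per distribution. The sketch below shows one way to read such counts in Python. The row counts are invented, and the 10% threshold is an assumption chosen for illustration rather than a documented limit.

```python
def empty_distributions(row_counts):
    """Number of distributions that hold no rows at all."""
    return sum(1 for c in row_counts if c == 0)


def skew_ratio(row_counts):
    """How far the fullest distribution sits above the average, as a fraction."""
    average = sum(row_counts) / len(row_counts)
    if average == 0:
        return 0.0
    return (max(row_counts) - average) / average


def is_skewed(row_counts, threshold=0.10):
    """Flag the table when the fullest distribution exceeds the average by `threshold`."""
    return skew_ratio(row_counts) > threshold


# Hypothetical per-distribution row counts for a 60-distribution table.
even_counts = [1000] * 60
uneven_counts = [1000] * 57 + [0, 0, 5000]

print(empty_distributions(even_counts), is_skewed(even_counts))
print(empty_distributions(uneven_counts), is_skewed(uneven_counts))
```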
You have two fact tables named Flight and Weather. Queries targeting the tables will be based on the join between the following columns.
You need to recommend a solution that maximizes query performance. What should you include in the recommendation?
A. In the tables use a hash distribution of ArrivalDateTime and ReportDateTime.
B. In the tables use a hash distribution of ArrivalAirportID and AirportID.
C. In each table, create an IDENTITY column.
D. In each table, create a column as a composite of the other two columns in the table.
A company plans to use Apache Spark analytics to analyze intrusion detection data.
You need to recommend a solution to analyze network and system activity data for malicious activities and policy violations. The solution must minimize administrative effort.
What should you recommend?
A. Azure Data Lake Storage
B. Azure Databricks
C. Azure HDInsight
D. Azure Data Factory
You manage an enterprise data warehouse in Azure Synapse Analytics.
Users report slow performance when they run commonly used queries. Users do not report performance changes for infrequently used queries.
You need to monitor resource utilization to determine the source of the performance issues.
Which metric should you monitor?
A. DWU percentage
B. Cache hit percentage
C. DWU limit
D. Data IO percentage
You have an Azure Databricks resource.
You need to log actions that relate to changes in compute for the Databricks resource.
Which Databricks services should you log?
A. clusters
B. workspace
C. DBFS
D. SSH
E. jobs