MICROSOFT-DP203 – Practice Exam Questions (31

Microsoft's DP-203 You have an Azure Databricks workspace that contains a Delta Lake dimension table named Table1. Table1 is a Type 2 slowly changing dimension (SCD) table. You need to apply updates from a source table to Table1. Which Apache Spark SQL operation should you use?

#31

Microsoft's DP-203 You are designing an Azure Data Lake Storage solution that will transform raw JSON files for use in an analytical workload. You need to recommend a format for the transformed files. The solution must meet the following requirements:
✑ Contain information about the data types of each column in the files.
✑ Support querying a subset of columns in the files.
✑ Support read-heavy analytical workloads.
✑ Minimize the file size.
What should you recommend?

#32

Microsoft's DP-203 Note: This question is part of a series of questions that present the same scenario. Each question in the series contains a unique solution that might meet the stated goals. Some question sets might have more than one correct solution, while others might not have a correct solution. After you answer a question in this section, you will NOT be able to return to it. As a result, these questions will not appear in the review screen. You have an Azure Storage account that contains 100 GB of files. The files contain rows of text and numerical values. 75% of the rows contain description data that has an average length of 1.1 MB. You plan to copy the data from the storage account to an enterprise data warehouse in Azure Synapse Analytics. You need to prepare the files to ensure that the data copies quickly. Solution: You modify the files to ensure that each row is less than 1 MB. Does this meet the goal?

#33

Microsoft's DP-203 You plan to create a dimension table in Azure Synapse Analytics that will be less than 1 GB. You need to create the table to meet the following requirements:
✑ Provide the fastest query time.
✑ Minimize data movement during queries.
Which type of table should you use?

#34

Microsoft's DP-203 You are designing a dimension table in an Azure Synapse Analytics dedicated SQL pool. You need to create a surrogate key for the table. The solution must provide the fastest query performance. What should you use for the surrogate key?

#35

Microsoft's DP-203 You have an Azure Synapse Analytics dedicated SQL pool. You need to create a fact table named Table1 that will store sales data from the last three years. The solution must be optimized for the following query operations:
• Show order counts by week.
• Calculate sales totals by region.
• Calculate sales totals by product.
• Find all the orders from a given month.
Which data should you use to partition Table1?

#36

Microsoft's DP-203 You are designing the folder structure for an Azure Data Lake Storage Gen2 account. You identify the following usage patterns:
• Users will query data by using Azure Synapse Analytics serverless SQL pools and Azure Synapse Analytics serverless Apache Spark pools.
• Most queries will include a filter on the current year or week.
• Data will be secured by data source.
You need to recommend a folder structure that meets the following requirements:
• Supports the usage patterns
• Simplifies folder security
• Minimizes query times
Which folder structure should you recommend?
E. WW\YYYY\SubjectArea\DataSource\FileData_YYYY_MM_DD.parquet

#37

Microsoft's DP-203 You have an Azure Synapse Analytics dedicated SQL pool named Pool1. Pool1 contains a table named table1. You load 5 TB of data into table1. You need to ensure that columnstore compression is maximized for table1. Which statement should you execute?

#38

Microsoft's DP-203 You have an Azure Synapse Analytics dedicated SQL pool named pool1. You plan to implement a star schema in pool1 and create a new table named DimCustomer by using the following code. You need to ensure that DimCustomer has the necessary columns to support a Type 2 slowly changing dimension (SCD). Which two columns should you add? Each correct answer presents part of the solution. NOTE: Each correct selection is worth one point.
E. [EffectiveStartDate] [datetime] NOT NULL

#39

Microsoft's DP-203 You have an Azure subscription that contains an Azure Data Lake Storage Gen2 account named account1 and an Azure Synapse Analytics workspace named workspace1. You need to create an external table in a serverless SQL pool in workspace1. The external table will reference CSV files stored in account1. The solution must maximize performance. How should you configure the external table?

#40

Microsoft's DP-203 You have an Azure Databricks workspace and an Azure Data Lake Storage Gen2 account named storage1. New files are uploaded daily to storage1. You need to recommend a solution that configures storage1 as a structured streaming source. The solution must meet the following requirements:
• Incrementally process new files as they are uploaded to storage1.
• Minimize implementation and maintenance effort.
• Minimize the cost of processing millions of files.
• Support schema inference and schema drift.
Which should you include in the recommendation?

#41

Microsoft's DP-203 You have an Azure subscription that contains the resources shown in the following table. You need to read the TSV files by using ad-hoc queries and the OPENROWSET function. The solution must assign a name and override the inferred data type of each column. What should you include in the OPENROWSET function?

#42

Microsoft's DP-203 You have an Azure Synapse Analytics dedicated SQL pool. You plan to create a fact table named Table1 that will contain a clustered columnstore index. You need to optimize data compression and query performance for Table1. What is the minimum number of rows that Table1 should contain before you create partitions?

#43
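
One way to reason about this kind of sizing question, offered as a rough sketch rather than an official answer key: a dedicated SQL pool spreads every table across 60 distributions, and Microsoft's guidance is that a clustered columnstore index compresses best with roughly 1 million rows per distribution and partition. The Python below simply multiplies those figures out. The function name and the `partition_count` parameter are illustrative assumptions, not part of the question.

```python
# Sizing sketch for a clustered columnstore table in a dedicated SQL pool.
DISTRIBUTIONS = 60                    # a dedicated SQL pool always uses 60 distributions
TARGET_ROWS_PER_ROWGROUP = 1_000_000  # guidance figure: about 1 million rows per rowgroup


def min_rows_for_partitions(partition_count: int) -> int:
    """Rows needed so each (distribution, partition) pair holds about 1 million rows.

    Hypothetical helper for illustration only.
    """
    if partition_count < 1:
        raise ValueError("partition_count must be at least 1")
    return partition_count * DISTRIBUTIONS * TARGET_ROWS_PER_ROWGROUP


# With a single partition, the table should hold this many rows
# before further partitioning is worth considering.
single_partition_minimum = min_rows_for_partitions(1)
print(single_partition_minimum)
```

Under these assumptions, each additional partition raises the threshold by another 60 million rows, which is why over-partitioning a small fact table tends to hurt compression.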

Microsoft's DP-203 You have an Azure Synapse Analytics dedicated SQL pool that contains a table named DimSalesPerson. DimSalesPerson contains the following columns:
• RepSourceID
• SalesRepID
• FirstName
• LastName
• StartDate
• EndDate
• Region
You are developing an Azure Synapse Analytics pipeline that includes a mapping data flow named Dataflow1. Dataflow1 will read sales team data from an external source and use a Type 2 slowly changing dimension (SCD) when loading the data into DimSalesPerson. You need to update the last name of a salesperson in DimSalesPerson. Which two actions should you perform? Each correct answer presents part of the solution. NOTE: Each correct selection is worth one point.

#44
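
The general Type 2 pattern behind a question like this can be sketched in plain Python: the existing row is closed by setting its end date, and a new row carrying the changed attribute is appended with a fresh surrogate key. This is an in-memory illustration only, not a mapping data flow. It assumes that RepSourceID is the source (business) key, that SalesRepID is the surrogate key, and that the current row is the one whose EndDate is empty; the function name and sample values are hypothetical.

```python
from datetime import date


def apply_scd2_last_name_change(rows, rep_source_id, new_last_name, change_date, new_key):
    """Expire the current row for a salesperson and append a new current row.

    Assumptions (not stated in the question): RepSourceID is the business key,
    SalesRepID is the surrogate key, and EndDate is None on the current row.
    """
    for row in rows:
        if row["RepSourceID"] == rep_source_id and row["EndDate"] is None:
            # Step 1: close the existing version of the row.
            row["EndDate"] = change_date
            # Step 2: insert a new version with the changed attribute.
            new_row = dict(
                row,
                SalesRepID=new_key,
                LastName=new_last_name,
                StartDate=change_date,
                EndDate=None,
            )
            rows.append(new_row)
            return new_row
    raise KeyError(f"no current row for RepSourceID {rep_source_id}")


# Hypothetical sample data.
team = [
    {"RepSourceID": 7, "SalesRepID": 1, "FirstName": "Ana", "LastName": "Lee",
     "StartDate": date(2020, 1, 1), "EndDate": None, "Region": "West"},
]
apply_scd2_last_name_change(team, 7, "Ng", date(2024, 1, 1), new_key=2)
```

After the call, the original row is kept as history with its EndDate filled in, and the new row becomes the current version.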

Microsoft's DP-203 You have an Azure Synapse Analytics workspace named WS1 that contains an Apache Spark pool named Pool1. You plan to create a database named DB1 in Pool1. You need to ensure that when tables are created in DB1, the tables are available automatically as external tables to the built-in serverless SQL pool. Which format should you use for the tables in DB1?

#45

Microsoft's DP-203 You have an Azure Data Lake Storage Gen2 account named storage1. You plan to implement query acceleration for storage1. Which two file types support query acceleration? Each correct answer presents a complete solution. NOTE: Each correct selection is worth one point.
E. Avro

#46

Microsoft's DP-203 You have an Azure subscription that contains the resources shown in the following table. You need to read the files in storage1 by using ad-hoc queries and the OPENROWSET function. The solution must ensure that each rowset contains a single JSON record. To what should you set the FORMAT option of the OPENROWSET function?

#47

Microsoft's DP-203 You have an Azure subscription that contains an Azure Synapse Analytics workspace named ws1 and an Azure Cosmos DB database account named Cosmos1. Cosmos1 contains a container named container1 and ws1 contains a serverless SQL pool. You need to ensure that you can query the data in container1 by using the serverless SQL pool. Which three actions should you perform? Each correct answer presents part of the solution. NOTE: Each correct selection is worth one point.
E. Disable indexing for container1.

#48

Microsoft's DP-203 You have an Azure Data Factory pipeline named pipeline1. You need to execute pipeline1 at 2 AM every day. The solution must ensure that if the trigger for pipeline1 stops, the next pipeline execution will occur at 2 AM, following a restart of the trigger. Which type of trigger should you create?

#49

Microsoft's DP-203 You manage an enterprise data warehouse in Azure Synapse Analytics. Users report slow performance when they run commonly used queries. Users do not report performance changes for infrequently used queries. You need to monitor resource utilization to determine the source of the performance issues. Which metric should you monitor?

#50

Microsoft's DP-203 You have an Azure Synapse Analytics workspace that contains an Apache Spark pool named SparkPool1. SparkPool1 contains a Delta Lake table named SparkTable1. You need to recommend a solution that supports Transact-SQL queries against the data referenced by SparkTable1. The solution must ensure that the queries can use partition elimination. What should you include in the recommendation?

#51

Microsoft's DP-203 You are designing a sales transactions table in an Azure Synapse Analytics dedicated SQL pool. The table will contain approximately 60 million rows per month and will be partitioned by month. The table will use a clustered columnstore index and round-robin distribution. Approximately how many rows will there be for each combination of distribution and partition?

#52
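
The arithmetic for this scenario can be worked through as a short sketch. It relies on one fact not stated in the question: a dedicated SQL pool always splits a table across 60 distributions, and round-robin distribution spreads rows across them roughly evenly. The variable names are illustrative.

```python
# Rows per (distribution, partition) combination for the scenario described.
ROWS_PER_MONTH = 60_000_000  # from the question: about 60 million rows per month
DISTRIBUTIONS = 60           # fixed for every dedicated SQL pool

# One partition per month, so each monthly partition is split 60 ways.
rows_per_distribution_and_partition = ROWS_PER_MONTH // DISTRIBUTIONS
print(rows_per_distribution_and_partition)
```

The result lands near the 1 million rows per rowgroup that columnstore guidance treats as the target, which suggests monthly partitioning is about as fine-grained as this table should go.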

Microsoft's DP-203 You have an Azure Synapse Analytics workspace. You plan to deploy a lake database by using a database template in Azure Synapse. Which two elements are included in the template? Each correct answer presents part of the solution. NOTE: Each correct selection is worth one point.
E. table definitions

#53

Microsoft's DP-203 You are implementing a star schema in an Azure Synapse Analytics dedicated SQL pool. You plan to create a table named DimProduct. DimProduct must be a Type 3 slowly changing dimension (SCD) table that meets the following requirements:
• The values in two columns named ProductKey and ProductSourceID will remain the same.
• The values in three columns named ProductName, ProductDescription, and Color can change.
You need to add additional columns to complete the following table definition. Which three columns should you add? Each correct answer presents part of the solution. NOTE: Each correct selection is worth one point.
E. [OriginalColor] NVARCHAR(50) NOT NULL
F. [OriginalProductName] NVARCHAR(100) NULL

#54
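
For contrast with Type 2, a Type 3 dimension keeps a single row per member and records history in extra columns on that row. The sketch below shows the idea in plain Python under a loud assumption: each tracked attribute has a companion column named `Original<Attribute>` that is filled the first time the attribute changes. That naming mirrors the option text above but is otherwise hypothetical, as are the function name and sample values.

```python
def apply_type3_change(row, attribute, new_value):
    """Update an attribute in place while preserving its original value.

    Assumption: the companion column is named 'Original' + attribute and is
    populated only on the first change, so it always holds the first value seen.
    """
    original_col = f"Original{attribute}"
    if row.get(original_col) is None:
        # First change: remember the value the row started with.
        row[original_col] = row[attribute]
    # The current-value column is overwritten; no new row is created.
    row[attribute] = new_value
    return row


# Hypothetical sample row; the keys stay the same across changes.
product = {"ProductKey": 10, "ProductSourceID": 501,
           "ProductName": "Trail Bike", "Color": "Red", "OriginalColor": None}
apply_type3_change(product, "Color", "Blue")
apply_type3_change(product, "Color", "Green")
```

Note that only the original and current values survive here; intermediate values (Blue in this example) are lost, which is the usual trade-off of Type 3 against Type 2.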

Microsoft's DP-203 You have an Azure subscription that contains an Azure Synapse Analytics serverless SQL pool. You execute the following query. Where will the rows returned by the query be stored?

#55

Microsoft's DP-203 You are deploying a lake database by using an Azure Synapse database template. You need to add additional tables to the database. The solution must use the same grouping method as the template tables. Which grouping method should you use?

#56

Microsoft's DP-203 You have an Azure data factory connected to a Git repository that contains the following branches:
• main: Collaboration branch
• abc: Feature branch
• xyz: Feature branch
You save changes to a pipeline in the xyz branch. You need to publish the changes to the live service. What should you do first?

#57

Microsoft's DP-203 Note: This question is part of a series of questions that present the same scenario. Each question in the series contains a unique solution that might meet the stated goals. Some question sets might have more than one correct solution, while others might not have a correct solution. After you answer a question in this section, you will NOT be able to return to it. As a result, these questions will not appear in the review screen. You have an Azure subscription that contains an Azure data factory named ADF1. From Azure Data Factory Studio, you build a complex data pipeline in ADF1. You discover that the Save button is unavailable, and there are validation errors that prevent the pipeline from being published. You need to ensure that you can save the logic of the pipeline. Solution: You enable Git integration for ADF1. Does this meet the goal?

#58

Microsoft's DP-203 Note: This question is part of a series of questions that present the same scenario. Each question in the series contains a unique solution that might meet the stated goals. Some question sets might have more than one correct solution, while others might not have a correct solution. After you answer a question in this section, you will NOT be able to return to it. As a result, these questions will not appear in the review screen. You have an Azure subscription that contains an Azure data factory named ADF1. From Azure Data Factory Studio, you build a complex data pipeline in ADF1. You discover that the Save button is unavailable, and there are validation errors that prevent the pipeline from being published. You need to ensure that you can save the logic of the pipeline. Solution: You view the JSON code representation of the resource and copy the JSON to a file. Does this meet the goal?

#59

Microsoft's DP-203 Note: This question is part of a series of questions that present the same scenario. Each question in the series contains a unique solution that might meet the stated goals. Some question sets might have more than one correct solution, while others might not have a correct solution. After you answer a question in this section, you will NOT be able to return to it. As a result, these questions will not appear in the review screen. You have an Azure subscription that contains an Azure data factory named ADF1. From Azure Data Factory Studio, you build a complex data pipeline in ADF1. You discover that the Save button is unavailable, and there are validation errors that prevent the pipeline from being published. You need to ensure that you can save the logic of the pipeline. Solution: You export ADF1 as an Azure Resource Manager (ARM) template. Does this meet the goal?

#60