Microsoft's DP-203

Note: This question is part of a series of questions that present the same scenario. Each question in the series contains a unique solution that might meet the stated goals. Some question sets might have more than one correct solution, while others might not have a correct solution. After you answer a question in this section, you will NOT be able to return to it. As a result, these questions will not appear in the review screen.

You have an Azure subscription that contains an Azure data factory named ADF1. From Azure Data Factory Studio, you build a complex data pipeline in ADF1. You discover that the Save button is unavailable, and there are validation errors that prevent the pipeline from being published. You need to ensure that you can save the logic of the pipeline.

Solution: You disable all the triggers for ADF1.

Does this meet the goal?
#61
Answer: B
You have an Azure subscription that contains the resources shown in the following table. You need to implement Azure Synapse Link for Azure SQL Database.

Which two actions should you perform on sql1? Each correct answer presents part of the solution.

NOTE: Each correct selection is worth one point.
#62
Answer: A
You have an Azure subscription that contains an Azure Cosmos DB database. Azure Synapse Link is implemented on the database. You configure a full fidelity schema for the analytical store.

You perform the following actions:
• Insert {"customerID": 12, "customer": "Tailspin Toys"} as the first document in the container.
• Insert {"customerID": "14", "customer": "Contoso"} as the second document in the container.

How many columns will the analytical store contain?
#63
Answer: C
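Why the column count grows here: in full fidelity schema mode the analytical store keeps the data type in the property representation, so customerID inserted as a number and again as a string surfaces as two separate typed columns. A rough Python sketch of that naming idea (the exact suffix strings and the omission of system columns such as _rid, _ts, and id are simplifications, not the precise Cosmos DB layout):

```python
# Sketch: full fidelity schema yields one column per (property, type) pair.
# Type suffixes (.int32, .string, ...) are illustrative of the documented
# behaviour; system columns are deliberately ignored.
def type_suffix(value):
    if isinstance(value, bool):
        return "bool"
    if isinstance(value, int):
        return "int32"
    if isinstance(value, float):
        return "float64"
    return "string"

def analytical_columns(documents):
    cols = set()
    for doc in documents:
        for prop, value in doc.items():
            cols.add(f"{prop}.{type_suffix(value)}")
    return sorted(cols)

docs = [
    {"customerID": 12, "customer": "Tailspin Toys"},
    {"customerID": "14", "customer": "Contoso"},
]
print(analytical_columns(docs))
# ['customer.string', 'customerID.int32', 'customerID.string']
```

The key point the sketch shows: the second document's string-typed customerID does not reuse the first document's integer-typed column.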
You are developing a solution that will stream to Azure Stream Analytics. The solution will have both streaming data and reference data.

Which input type should you use for the reference data?
#64
Answer: B
You have an Azure subscription that contains an Azure Synapse Analytics workspace and a user named User1. You need to ensure that User1 can create a new lake database by using an Azure Synapse database template from the gallery. The solution must follow the principle of least privilege.

Which role should you assign to User1?
#65
Answer: A
A company has a real-time data analysis solution that is hosted on Microsoft Azure. The solution uses Azure Event Hubs to ingest data and an Azure Stream Analytics cloud job to analyze the data. The cloud job is configured to use 120 Streaming Units (SU). You need to optimize performance for the Azure Stream Analytics job.

Which two actions should you perform? Each correct answer presents part of the solution.

NOTE: Each correct selection is worth one point.

E. Scale the SU count for the job down.
F. Implement query parallelization by partitioning the data input.
#66
Answer: C
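Option F helps because a partitioned input lets each partition be processed independently, with no cross-partition state. A hedged Python sketch of that independence (the byte-sum hash and the partition count are illustrative only, not anything Stream Analytics itself uses):

```python
from collections import Counter

def partition(events, n):
    """Assign each (key, value) event to one of n partitions by its key."""
    shards = [[] for _ in range(n)]
    for key, value in events:
        shards[sum(key.encode()) % n].append((key, value))
    return shards

def count_per_key(events, n=4):
    # Each shard is counted independently; because partitioning is by key,
    # shard results can simply be merged. That independence is what lets a
    # partitioned query scale out across streaming units.
    totals = Counter()
    for shard in partition(events, n):
        totals.update(k for k, _ in shard)
    return totals

events = [("UK", 1), ("US", 2), ("UK", 3), ("DE", 4)]
print(count_per_key(events))
```

Note the invariant the sketch relies on: the merged result is the same regardless of how many partitions are used.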
You need to trigger an Azure Data Factory pipeline when a file arrives in an Azure Data Lake Storage Gen2 container.

Which resource provider should you enable?
#67
Answer: C
You plan to perform batch processing in Azure Databricks once daily.

Which type of Databricks cluster should you use?
#68
Answer: B
You have an Azure Data Factory instance that contains two pipelines named Pipeline1 and Pipeline2. Pipeline1 has the activities shown in the following exhibit. Pipeline2 has the activities shown in the following exhibit. You execute Pipeline2, and Stored procedure1 in Pipeline1 fails.

What is the status of the pipeline runs?
#69
Answer: A
You have an Azure Data Factory that contains 10 pipelines. You need to label each pipeline with its main purpose of either ingest, transform, or load. The labels must be available for grouping and filtering when using the monitoring experience in Data Factory.

What should you add to each pipeline?
#70
Answer: D
You are designing a statistical analysis solution that will use custom proprietary Python functions on near real-time data from Azure Event Hubs. You need to recommend which Azure service to use to perform the statistical analysis. The solution must minimize latency.

What should you recommend?
#71
Answer: B
You need to implement a Type 3 slowly changing dimension (SCD) for product category data in an Azure Synapse Analytics dedicated SQL pool. You have a table that was created by using the following Transact-SQL statement.

Which two columns should you add to the table? Each correct answer presents part of the solution.

NOTE: Each correct selection is worth one point.

E. [OriginalProductCategory] [nvarchar] (100) NOT NULL,
#72
Answer: B
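For context, a Type 3 SCD keeps limited history in extra columns on the same row (an original or previous value alongside the current one) rather than adding new rows. A rough Python sketch of the update rule, with a column name borrowed from answer choice E (the dict-based "table" is purely illustrative):

```python
# Sketch of a Type 3 SCD update: history lives in extra columns on the
# same row, not in extra rows.
def scd3_update(row, new_category):
    # On the first change, preserve the original value; after that, only
    # the current value changes -- Type 3 keeps limited history.
    if row.get("OriginalProductCategory") is None:
        row["OriginalProductCategory"] = row["ProductCategory"]
    row["ProductCategory"] = new_category
    return row

row = {"ProductKey": 1, "ProductCategory": "Toys", "OriginalProductCategory": None}
scd3_update(row, "Games")
print(row)
# {'ProductKey': 1, 'ProductCategory': 'Games', 'OriginalProductCategory': 'Toys'}

scd3_update(row, "Puzzles")
print(row["OriginalProductCategory"])  # still 'Toys' -- only limited history is kept
```

Contrast with Type 2, which would insert a new row per change with effective-date columns.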
Note: This question is part of a series of questions that present the same scenario. Each question in the series contains a unique solution that might meet the stated goals. Some question sets might have more than one correct solution, while others might not have a correct solution. After you answer a question in this section, you will NOT be able to return to it. As a result, these questions will not appear in the review screen.

You are designing an Azure Stream Analytics solution that will analyze Twitter data. You need to count the tweets in each 10-second window. The solution must ensure that each tweet is counted only once.

Solution: You use a hopping window that uses a hop size of 10 seconds and a window size of 10 seconds.

Does this meet the goal?
#73
Answer: A
Note: This question is part of a series of questions that present the same scenario. Each question in the series contains a unique solution that might meet the stated goals. Some question sets might have more than one correct solution, while others might not have a correct solution. After you answer a question in this section, you will NOT be able to return to it. As a result, these questions will not appear in the review screen.

You are designing an Azure Stream Analytics solution that will analyze Twitter data. You need to count the tweets in each 10-second window. The solution must ensure that each tweet is counted only once.

Solution: You use a hopping window that uses a hop size of 5 seconds and a window size of 10 seconds.

Does this meet the goal?
#74
Answer: B
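The hop size is what decides whether tweets are double-counted: with hop equal to size, the windows tile the timeline (tumbling behaviour), while hop smaller than size makes consecutive windows overlap. A small Python sketch of which hopping windows contain a given event (integer-second timestamps; window ends falling on multiples of the hop is an assumed convention):

```python
def windows_containing(t, hop, size):
    """Return end times of hopping windows (end - size, end] that cover t.

    Window end times are assumed to fall on multiples of `hop`.
    """
    first = -(-t // hop) * hop          # smallest multiple of hop >= t
    return list(range(first, t + size, hop))

print(windows_containing(12, hop=10, size=10))  # [20] -> counted once
print(windows_containing(12, hop=5, size=10))   # [15, 20] -> counted twice

# hop == size: every event lands in exactly one window (tumbling behaviour).
assert all(len(windows_containing(t, 10, 10)) == 1 for t in range(100))
# hop < size: overlapping windows, so every event is counted twice.
assert all(len(windows_containing(t, 5, 10)) == 2 for t in range(100))
```

This is why the 10-second hop in question #73 meets the goal while the 5-second hop here does not.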
You are creating an Azure Data Factory data flow that will ingest data from a CSV file, cast columns to specified types of data, and insert the data into a table in an Azure Synapse Analytics dedicated SQL pool. The CSV file contains three columns named username, comment, and date.

The data flow already contains the following:
✑ A source transformation.
✑ A Derived Column transformation to set the appropriate types of data.
✑ A sink transformation to land the data in the pool.

You need to ensure that the data flow meets the following requirements:
✑ All valid rows must be written to the destination table.
✑ Truncation errors in the comment column must be avoided proactively.
✑ Any rows containing comment values that will cause truncation errors upon insert must be written to a file in blob storage.

Which two actions should you perform? Each correct answer presents part of the solution.

NOTE: Each correct selection is worth one point.
#75
Answer: A
You have an Azure Storage account and a data warehouse in Azure Synapse Analytics in the UK South region. You need to copy blob data from the storage account to the data warehouse by using Azure Data Factory. The solution must meet the following requirements:
✑ Ensure that the data remains in the UK South region at all times.
✑ Minimize administrative effort.

Which type of integration runtime should you use?
#76
Answer: A
You have an Azure Stream Analytics job that receives clickstream data from an Azure event hub. You need to define a query in the Stream Analytics job. The query must meet the following requirements:
✑ Count the number of clicks within each 10-second window based on the country of a visitor.
✑ Ensure that each click is NOT counted more than once.

How should you define the query?
#77
Answer: B
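The count-each-click-once requirement points at a tumbling window, whose windows never overlap, grouped by country. A rough Python sketch of 10-second tumbling counts over (timestamp_seconds, country) click events (the (end - size, end] window convention is an assumption of the sketch):

```python
from collections import Counter

def tumbling_counts(events, size=10):
    """Count events per (window, country); each event lands in exactly
    one non-overlapping window, so nothing is double-counted."""
    counts = Counter()
    for t, country in events:
        window_end = (t // size + 1) * size   # window covering (end - size, end]
        counts[(window_end, country)] += 1
    return dict(counts)

clicks = [(1, "UK"), (4, "UK"), (9, "US"), (12, "UK")]
print(tumbling_counts(clicks))
# {(10, 'UK'): 2, (10, 'US'): 1, (20, 'UK'): 1}
```

Summing the per-window counts always gives back the total number of clicks, which is exactly the "not counted more than once" guarantee.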
You need to schedule an Azure Data Factory pipeline to execute when a new file arrives in an Azure Data Lake Storage Gen2 container.

Which type of trigger should you use?
#78
Answer: D
You have two Azure Data Factory instances named ADFdev and ADFprod. ADFdev connects to an Azure DevOps Git repository. You publish changes from the main branch of the Git repository to ADFdev. You need to deploy the artifacts from ADFdev to ADFprod.

What should you do first?
#79
Answer: C
You are developing a solution that will stream to Azure Stream Analytics. The solution will have both streaming data and reference data.

Which input type should you use for the reference data?
#80
Answer: B
You are designing an Azure Stream Analytics job to process incoming events from sensors in retail environments. You need to process the events to produce a running average of shopper counts during the previous 15 minutes, calculated at five-minute intervals.

Which type of window should you use?
#81
Answer: C
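A 15-minute average recomputed every five minutes is the textbook hopping-window shape: window size 15, hop 5. A hedged Python sketch over (minute, shopper_count) samples (the (end - size, end] convention and the stopping point for emitted windows are simplifications):

```python
def hopping_averages(samples, size=15, hop=5):
    """Average of samples in each hopping window (end - size, end].

    Windows end on multiples of `hop`; for simplicity, emission stops at
    the first hop boundary at or after the last sample.
    """
    if not samples:
        return {}
    last_end = -(-max(t for t, _ in samples) // hop) * hop
    results = {}
    for end in range(hop, last_end + 1, hop):
        window = [v for t, v in samples if end - size < t <= end]
        if window:
            results[end] = sum(window) / len(window)
    return results

samples = [(2, 10), (6, 20), (11, 30), (16, 40)]
print(hopping_averages(samples))
# {5: 10.0, 10: 15.0, 15: 20.0, 20: 30.0}
```

Note how each emission at a five-minute boundary looks back over the full 15-minute window, so consecutive results share samples, which is exactly the "running average" behaviour asked for.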
You are designing an Azure Databricks table. The table will ingest an average of 20 million streaming events per day. You need to persist the events in the table for use in incremental load pipeline jobs in Azure Databricks. The solution must minimize storage costs and incremental load times.

What should you include in the solution?
#82
Answer: B
You have an Azure Databricks workspace named workspace1 in the Standard pricing tier. You need to configure workspace1 to support autoscaling all-purpose clusters. The solution must meet the following requirements:
✑ Automatically scale down workers when the cluster is underutilized for three minutes.
✑ Minimize the time it takes to scale to the maximum number of workers.
✑ Minimize costs.

What should you do first?
#83
Answer: B
Note: This question is part of a series of questions that present the same scenario. Each question in the series contains a unique solution that might meet the stated goals. Some question sets might have more than one correct solution, while others might not have a correct solution. After you answer a question in this section, you will NOT be able to return to it. As a result, these questions will not appear in the review screen.

You are designing an Azure Stream Analytics solution that will analyze Twitter data. You need to count the tweets in each 10-second window. The solution must ensure that each tweet is counted only once.

Solution: You use a tumbling window, and you set the window size to 10 seconds.

Does this meet the goal?
#84
Answer: A
Note: This question is part of a series of questions that present the same scenario. Each question in the series contains a unique solution that might meet the stated goals. Some question sets might have more than one correct solution, while others might not have a correct solution. After you answer a question in this section, you will NOT be able to return to it. As a result, these questions will not appear in the review screen.

You are designing an Azure Stream Analytics solution that will analyze Twitter data. You need to count the tweets in each 10-second window. The solution must ensure that each tweet is counted only once.

Solution: You use a session window that uses a timeout size of 10 seconds.

Does this meet the goal?
#85
Answer: B
You use Azure Stream Analytics to receive data from Azure Event Hubs and to output the data to an Azure Blob Storage account. You need to output the count of records received from the last five minutes every minute.

Which windowing function should you use?
#86
Answer: D
Note: This question is part of a series of questions that present the same scenario. Each question in the series contains a unique solution that might meet the stated goals. Some question sets might have more than one correct solution, while others might not have a correct solution. After you answer a question in this section, you will NOT be able to return to it. As a result, these questions will not appear in the review screen.

You plan to create an Azure Databricks workspace that has a tiered structure. The workspace will contain the following three workloads:
✑ A workload for data engineers who will use Python and SQL.
✑ A workload for jobs that will run notebooks that use Python, Scala, and SQL.
✑ A workload that data scientists will use to perform ad hoc analysis in Scala and R.

The enterprise architecture team at your company identifies the following standards for Databricks environments:
✑ The data engineers must share a cluster.
✑ The job cluster will be managed by using a request process whereby data scientists and data engineers provide packaged notebooks for deployment to the cluster.
✑ All the data scientists must be assigned their own cluster that terminates automatically after 120 minutes of inactivity. Currently, there are three data scientists.

You need to create the Databricks clusters for the workloads.

Solution: You create a Standard cluster for each data scientist, a Standard cluster for the data engineers, and a High Concurrency cluster for the jobs.

Does this meet the goal?
#87
Answer: B
You have the following Azure Data Factory pipelines:
✑ Ingest Data from System1
✑ Ingest Data from System2
✑ Populate Dimensions
✑ Populate Facts

Ingest Data from System1 and Ingest Data from System2 have no dependencies. Populate Dimensions must execute after Ingest Data from System1 and Ingest Data from System2. Populate Facts must execute after the Populate Dimensions pipeline. All the pipelines must execute every eight hours.

What should you do to schedule the pipelines for execution?
#88
Answer: C
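Whatever trigger mechanism is chosen, it must respect the stated dependency graph: both ingest pipelines before Populate Dimensions, and Populate Dimensions before Populate Facts. A small Python sketch (Kahn's topological sort) that derives a valid execution order from exactly the dependencies in the question:

```python
from collections import deque

# Dependency map: pipeline -> pipelines it must run after (from the question).
deps = {
    "Ingest Data from System1": [],
    "Ingest Data from System2": [],
    "Populate Dimensions": ["Ingest Data from System1", "Ingest Data from System2"],
    "Populate Facts": ["Populate Dimensions"],
}

def execution_order(deps):
    """Kahn's algorithm: repeatedly run whatever has no unmet dependencies."""
    indegree = {p: len(d) for p, d in deps.items()}
    dependents = {p: [] for p in deps}
    for p, ds in deps.items():
        for d in ds:
            dependents[d].append(p)
    queue = deque(p for p, n in indegree.items() if n == 0)
    order = []
    while queue:
        p = queue.popleft()
        order.append(p)
        for q in dependents[p]:
            indegree[q] -= 1
            if indegree[q] == 0:
                queue.append(q)
    return order

print(execution_order(deps))
```

The two ingest pipelines have no ordering constraint between them and could run in parallel; only the dimension and fact loads are strictly sequenced.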
You are monitoring an Azure Stream Analytics job by using metrics in Azure. You discover that during the last 12 hours, the average watermark delay is consistently greater than the configured late arrival tolerance.

What is a possible cause of this behavior?
#89
Answer: D
Note: This question is part of a series of questions that present the same scenario. Each question in the series contains a unique solution that might meet the stated goals. Some question sets might have more than one correct solution, while others might not have a correct solution. After you answer a question in this section, you will NOT be able to return to it. As a result, these questions will not appear in the review screen.

You have an Azure Data Lake Storage account that contains a staging zone. You need to design a daily process to ingest incremental data from the staging zone, transform the data by executing an R script, and then insert the transformed data into a data warehouse in Azure Synapse Analytics.

Solution: You use an Azure Data Factory schedule trigger to execute a pipeline that copies the data to a staging table in the data warehouse, and then uses a stored procedure to execute the R script.

Does this meet the goal?