Designing and Implementing a Data Science Solution on Azure
Exam Details
Exam Code: DP-100
Exam Name: Designing and Implementing a Data Science Solution on Azure
Certification: Microsoft Certifications
Vendor: Microsoft
Total Questions: 564 Q&As
Last Updated: Mar 29, 2025
Microsoft Certifications DP-100 Questions & Answers
Question 391:
You have a dataset that contains records of patients tested for diabetes. The dataset includes the patient's age.
You plan to create an analysis that will report the mean age value from the differentially private data derived from the dataset.
You need to identify the epsilon value to use in the analysis that minimizes the risk of exposing the actual data.
Which epsilon value should you use?
A. -1.5
B. -0.5
C. 0.5
D. 1.5
Correct Answer: C
Epsilon: Put simply, epsilon is a non-negative value that provides an inverse measure of the amount of noise added to the data. A low epsilon results in a dataset with a greater level of privacy, while a high epsilon results in a dataset that is closer to the original data. Generally, you should use epsilon values between 0 and 1. Epsilon is correlated with another value named delta, which indicates the probability that a report generated by an analysis is not fully private.
Note: Differential privacy tries to protect against the possibility that a user can produce an indefinite number of reports to eventually reveal sensitive data. A value known as epsilon measures how noisy, or private, a report is. Epsilon has an inverse relationship to noise or privacy. The lower the epsilon, the more noisy (and private) the data is.
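To see how epsilon trades privacy for accuracy, here is a minimal sketch of the Laplace mechanism in Python. The age values, bounds, and sensitivity calculation are illustrative assumptions, not part of the exam question.

```python
import numpy as np

def dp_mean(ages, epsilon, lower=0, upper=120):
    """Differentially private mean of `ages` using the Laplace mechanism.

    Ages are clipped to [lower, upper]; the sensitivity of the mean is then
    (upper - lower) / n for a dataset of n records.
    """
    ages = np.clip(ages, lower, upper)
    sensitivity = (upper - lower) / len(ages)
    # Lower epsilon -> larger noise scale -> more privacy, less accuracy.
    noise = np.random.laplace(loc=0.0, scale=sensitivity / epsilon)
    return ages.mean() + noise

ages = np.array([34, 52, 61, 45, 70, 29, 58])
print(dp_mean(ages, epsilon=0.5))  # noisier, more private
print(dp_mean(ages, epsilon=1.5))  # closer to the true mean, less private
```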
You use MLflow to track a model training run. You need to log multiple metrics with the best possible performance.
What are two possible ways to achieve this goal? Each correct answer presents a complete solution.
NOTE: Each correct selection is worth one point.
A. MLflowClient.log_batch
B. mlflow.log_metrics
C. mlflow.log_metric
D. mlflow.log_param
Correct Answer: AB
Performance considerations: If you need to log multiple metrics (or multiple values for the same metric), avoid calling mlflow.log_metric in a loop. Better performance is achieved by logging metrics in a batch. Use the method mlflow.log_metrics, which accepts a dictionary with all the metrics you want to log at once, or use MLflowClient.log_batch, which accepts multiple types of elements for logging.
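As a rough illustration (metric names and values are placeholders; MLflowClient in option A corresponds to the mlflow.tracking.MlflowClient class), both approaches log several metrics in a single call:

```python
import time
import mlflow
from mlflow.tracking import MlflowClient
from mlflow.entities import Metric

with mlflow.start_run() as run:
    # Option B: mlflow.log_metrics accepts a dictionary of metrics.
    mlflow.log_metrics({"accuracy": 0.91, "auc": 0.87, "f1": 0.84})

    # Option A: MlflowClient.log_batch accepts lists of Metric entities
    # (as well as params and tags) and logs them in one request.
    client = MlflowClient()
    ts = int(time.time() * 1000)
    loss_values = [Metric(key="loss", value=v, timestamp=ts, step=i)
                   for i, v in enumerate([0.9, 0.6, 0.4])]
    client.log_batch(run.info.run_id, metrics=loss_values)
```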
You need to run a pipeline that retrains the model based on a trigger from an external system.
What should you configure?
A. Azure Data Catalog
B. Azure Batch
C. Azure Logic App
Correct Answer: C
An Azure Logic App allows for more complex triggering logic or behavior.
To use an Azure Logic App to trigger a Machine Learning pipeline, you'll need the REST endpoint for a published Machine Learning pipeline. Create and publish your pipeline. Then find the REST endpoint of your PublishedPipeline by using the pipeline ID.
# You can find the pipeline ID in Azure Machine Learning studio
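A minimal sketch using the Azure ML Python SDK v1 to retrieve that REST endpoint; the workspace config file and pipeline ID value are placeholders:

```python
from azureml.core import Workspace
from azureml.pipeline.core import PublishedPipeline

ws = Workspace.from_config()  # loads workspace details from config.json

# Use the pipeline ID shown in Azure Machine Learning studio (placeholder here).
published_pipeline = PublishedPipeline.get(ws, id="<pipeline-id>")

# REST endpoint to call from the Logic App's HTTP action.
print(published_pipeline.endpoint)
```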
Note: Azure Logic Apps is a cloud platform where you can create and run automated workflows with little to no code. By using the visual designer and selecting from prebuilt operations, you can quickly build a workflow that integrates and
manages your apps, data, services, and systems.
Azure Logic Apps simplifies the way that you connect legacy, modern, and cutting-edge systems across cloud, on premises, and hybrid environments and provides low-code-no-code tools for you to develop highly scalable integration
solutions for your enterprise and business-to-business (B2B) scenarios.
You create an Azure Machine Learning pipeline named pipeline1 with two steps that contain Python scripts. Data processed by the first step is passed to the second step.
You must update the content of the downstream data source of pipeline1 and run the pipeline again.
You need to ensure the new run of pipeline1 fully processes the updated content.
Solution: Set the regenerate_outputs parameter of the pipeline1 experiment's run submit method to True.
Does the solution meet the goal?
A. Yes
B. No
Correct Answer: A
There are a number of optional settings for a Pipeline which can be specified at submission time in the submit method.
regenerate_outputs: Whether to force regeneration of all step outputs and disallow data reuse for this run, default is False.
If False, this run may reuse results from previous runs and subsequent runs may reuse the results of this run.
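A minimal sketch of the submit call, assuming the two script steps (prep_step and train_step) were built earlier and the experiment name is illustrative:

```python
from azureml.core import Workspace, Experiment
from azureml.pipeline.core import Pipeline

ws = Workspace.from_config()
experiment = Experiment(ws, "pipeline1-experiment")  # name is illustrative

# prep_step and train_step are the two Python-script steps built earlier (not shown).
pipeline1 = Pipeline(workspace=ws, steps=[prep_step, train_step])

# regenerate_outputs=True disallows output reuse, so both steps run again and
# fully process the updated downstream data.
run = experiment.submit(pipeline1, regenerate_outputs=True)
run.wait_for_completion(show_output=True)
```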
You create an Azure Machine Learning workspace. The workspace contains a dataset named sample_dataset, a compute instance, and a compute cluster.
You must create a two-stage pipeline that will prepare data in the dataset and then train and register a model based on the prepared data.
The first stage of the pipeline contains the following code:
You need to identify the location containing the output of the first stage of the script that you can use as input for the second stage. Which storage location should you use?
A. workspaceblobstore datastore
B. workspacefilestore datastore
C. compute instance
D. compute_cluster
Correct Answer: A
The OutputFileDatasetConfig allows you to specify how you want a particular local path on the compute target to be uploaded to the specified destination.
Parameters:
destination - The destination to copy the output to. If set to None, we will copy the output to the workspaceblobstore datastore, under the path /dataset/{run-id}/{output-name}, where run-id is the Run's ID and output-name is the output name from the name parameter above.
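A minimal sketch of how the first stage might declare its output with the default destination; the output and variable names are placeholders, and the default-datastore behavior follows the parameter description above:

```python
from azureml.data import OutputFileDatasetConfig

# With destination left as None (the default), anything the script writes to this
# output path is uploaded to the workspaceblobstore datastore under
# /dataset/{run-id}/{output-name}.
prepared_data = OutputFileDatasetConfig(name="prepared_data")

# The second stage can then consume it as an input dataset:
# prepared_data.as_input(name="prepared_data")
```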
You create an Azure Machine Learning workspace. You use Azure Machine Learning designer to create a pipeline within the workspace.
You need to submit a pipeline run from the designer.
What should you do first?
A. Create an experiment.
B. Create an attached compute resource.
C. Create a compute cluster.
D. Select a model.
Correct Answer: B
Create a new workspace (already done).
Create the pipeline:
1. Sign in to ml.azure.com, and select the workspace you want to work with.
2. Select Designer -> Classic prebuilt.
3. Select Create a new pipeline using classic prebuilt components.
4. Click the pencil icon beside the automatically generated pipeline draft name and rename it to Automobile price prediction. The name doesn't need to be unique.
Set the default compute target
A pipeline runs on a compute target, which is a compute resource that's attached to your workspace. After you create a compute target, you can reuse it for future jobs.
Note: Create a new workspace
You need an Azure Machine Learning workspace to use the designer. The workspace is the top-level resource for Azure Machine Learning, it provides a centralized place to work with all the artifacts you create in Azure Machine Learning.
You train and register an Azure Machine Learning model.
You plan to deploy the model to an online endpoint.
You need to ensure that applications will be able to use the authentication method with a non-expiring artifact to access the model.
Solution: Create a managed online endpoint and set the value of its auth_mode parameter to key. Deploy the model to the online endpoint.
Does the solution meet the goal?
A. Yes
B. No
Correct Answer: A
Authentication mode: The authentication method for the endpoint. Choose between key-based authentication and Azure Machine Learning token-based authentication. A key doesn't expire, but a token does expire.
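A minimal sketch of creating a managed online endpoint with key-based authentication using the Azure ML Python SDK v2; the endpoint name and workspace identifiers are placeholders:

```python
from azure.ai.ml import MLClient
from azure.ai.ml.entities import ManagedOnlineEndpoint
from azure.identity import DefaultAzureCredential

ml_client = MLClient(
    DefaultAzureCredential(),
    subscription_id="<subscription-id>",
    resource_group_name="<resource-group>",
    workspace_name="<workspace>",
)

# auth_mode="key" issues primary/secondary keys that do not expire;
# auth_mode="aml_token" would issue tokens that expire and must be refreshed.
endpoint = ManagedOnlineEndpoint(name="diabetes-endpoint", auth_mode="key")
ml_client.online_endpoints.begin_create_or_update(endpoint).result()
```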
You use the Azure Machine Learning SDK v2 for Python and notebooks to train a model. You use Python code to create a compute target, an environment, and a training script.
You need to prepare information to submit a training job.
Which class should you use?
A. MLClient
B. BuildContext
C. EndpointConnection
D. command
Correct Answer: D
Build the training job
Now that you have all the assets required to run your job, it's time to build it using the Azure Machine Learning Python SDK v2. For this, we'll be creating a command.
An Azure Machine Learning command is a resource that specifies all the details needed to execute your training code in the cloud. These details include the inputs and outputs, type of hardware to use, software to install, and how to run your code. The command contains information to execute a single command.
Configure the command
You'll use the general purpose command to run the training script and perform your desired tasks. Create a Command object to specify the configuration details of your training job.
The inputs for this command include the number of epochs, learning rate, momentum, and output directory.
For the parameter values:
provide the compute cluster cpu_compute_target = "cpu-cluster" that you created for running this command;
provide the custom environment sklearn-env that you created for running the Azure Machine Learning job;
configure the command line action itself: in this case, the command is python train_iris.py. You can access the inputs and outputs in the command via the ${{ ... }} notation; and
configure the metadata such as the display name and experiment name, where an experiment is a container for all the iterations one does on a certain project. Note that all the jobs submitted under the same experiment name would be listed next to each other in Azure Machine Learning studio.
A sketch of such a command follows this list.
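The sketch below assembles those pieces with the SDK v2 command function. The script name, environment, and compute names come from the explanation above; the source folder, input values, and workspace identifiers are placeholders:

```python
from azure.ai.ml import MLClient, command
from azure.identity import DefaultAzureCredential

ml_client = MLClient(
    DefaultAzureCredential(),
    subscription_id="<subscription-id>",
    resource_group_name="<resource-group>",
    workspace_name="<workspace>",
)

# The command object bundles code, command line, inputs, environment,
# compute, and metadata into a single submittable training job.
job = command(
    code="./src",  # folder containing train_iris.py
    command="python train_iris.py --epochs ${{inputs.epochs}} --lr ${{inputs.learning_rate}}",
    inputs={"epochs": 10, "learning_rate": 0.01},
    environment="sklearn-env@latest",
    compute="cpu-cluster",
    display_name="train-iris",
    experiment_name="iris-training",
)

returned_job = ml_client.jobs.create_or_update(job)
```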
You build a custom model that you must log with MLflow. The custom model includes the following:
The model is not natively supported by MLflow.
The model cannot be serialized in Pickle format.
The model source code is complex.
The Python library for the model must be packaged with the model.
You need to create a custom model flavor to enable logging with MLflow. What should you use?
A. model loader
B. artifacts
C. model wrapper
D. custom signatures
Correct Answer: C
If you didn't train your model with MLflow and want to use Azure Machine Learning's MLflow no-code deployment offering, you need to convert your custom model to MLflow format.
Prerequisites
Only the mlflow package is needed to convert your custom models to an MLflow format.
1. Create a Python wrapper for your model
Before you can convert your model to an MLflow supported format, you need to first create a Python wrapper for your model.
2. Create a Conda environment
Next, you need to create a Conda environment for the new MLflow Model that contains all necessary dependencies.
3. Load the MLflow formatted model and test predictions
Once your environment is ready, you can pass the SKlearnWrapper, the Conda environment, and your newly created artifacts dictionary to the mlflow.pyfunc.save_model() method. Doing so saves the model to your disk.
Note: MLflow provides support for a variety of machine learning frameworks (scikit-learn, Keras, Pytorch, and more); however, it might not cover every use case. For example, you may want to create an MLflow model with a framework that MLflow does not natively support, or you may want to change the way your model does pre-processing or post-processing when running jobs.
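A minimal sketch of the wrapper-plus-save pattern described above. The SKlearnWrapper name follows the explanation; the joblib serialization and the model.joblib artifact path are illustrative assumptions:

```python
import mlflow.pyfunc

class SKlearnWrapper(mlflow.pyfunc.PythonModel):
    """Model wrapper around a model that MLflow cannot serialize natively."""

    def load_context(self, context):
        # Load the model from the artifacts dictionary passed to save_model.
        import joblib  # the packaged library is an assumption for this sketch
        self.model = joblib.load(context.artifacts["model_path"])

    def predict(self, context, model_input):
        return self.model.predict(model_input)

# Conda environment bundling the Python library the model depends on.
conda_env = {
    "name": "custom-model-env",
    "channels": ["conda-forge"],
    "dependencies": ["python=3.9", "pip", {"pip": ["mlflow", "scikit-learn", "joblib"]}],
}

# artifacts maps names to local files that are packaged with the MLflow model.
mlflow.pyfunc.save_model(
    path="custom_mlflow_model",
    python_model=SKlearnWrapper(),
    conda_env=conda_env,
    artifacts={"model_path": "model.joblib"},
)
```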