Designing and Implementing a Data Science Solution on Azure
Exam Details
Exam Code: DP-100
Exam Name: Designing and Implementing a Data Science Solution on Azure
Certification: Microsoft Certifications
Vendor: Microsoft
Total Questions: 564 Q&As
Last Updated: Mar 29, 2025
Microsoft Microsoft Certifications DP-100 Questions & Answers
Question 371:
You plan to build a team data science environment. Data for training models in machine learning pipelines will be over 20 GB in size. You have the following requirements:
1. Models must be built using Caffe2 or Chainer frameworks.
2. Data scientists must be able to use a data science environment to build the machine learning pipelines and train models on their personal devices in both connected and disconnected network environments.
3. Personal devices must support updating machine learning pipelines when connected to a network.
You need to select a data science environment.
Which environment should you use?
A. Azure Machine Learning Service
B. Azure Machine Learning Studio
C. Azure Databricks
D. Azure Kubernetes Service (AKS)
Correct Answer: A
The Data Science Virtual Machine (DSVM) is a customized VM image on Microsoft's Azure cloud built specifically for doing data science. Caffe2 and Chainer are supported by DSVM. DSVM integrates with Azure Machine Learning.
Incorrect Answers:
B: Use Machine Learning Studio when you want to experiment with machine learning models quickly and easily, and the built-in machine learning algorithms are sufficient for your solutions.
You manage an Azure Machine Learning workspace. The workspace includes an Azure Machine Learning Kubernetes compute target configured as an Azure Kubernetes Service (AKS) cluster named AKS1. AKS1 is configured to enable the targeting of different nodes to train workloads.
You must run a command job on AKS1 by using the Azure ML Python SDK v2. The command job must select different types of compute nodes. The compute node types must be specified by using a command parameter.
You need to configure the command parameter.
Which parameter should you use?
A. environment
B. compute
C. limits
D. instance_type
Correct Answer: D
command
Create a Command object which can be used inside dsl.pipeline as a function and can also be created as a standalone command job.
Parameters include:
* instance_type: Optional type of VM used as supported by the compute target.
Incorrect:
* environment: The environment to use for this command.
* compute: The name of the compute where the command job is executed (will not be used if the command is used as a component/function).
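As a rough illustration of where instance_type fits, the sketch below collects the keyword arguments for the SDK v2 command() function and imports azure.ai.ml only when the job object is actually built. The source folder, script name, environment reference, and the instance type name "gpu-instance" are hypothetical placeholders; the instance type has to be one that is defined on the Kubernetes cluster.

```python
def build_command_kwargs(instance_type: str) -> dict:
    """Keyword arguments for azure.ai.ml.command(); the values are placeholders."""
    return {
        "code": "./src",                                   # hypothetical folder containing train.py
        "command": "python train.py",                      # hypothetical training script
        "environment": "azureml:my-training-env@latest",   # hypothetical environment reference
        "compute": "AKS1",                                 # Kubernetes compute target from the question
        "instance_type": instance_type,                    # selects the node type on the cluster
    }


def build_command_job(instance_type: str):
    """Create the command job object (requires the azure-ai-ml package)."""
    # Imported lazily so build_command_kwargs() can be used without the SDK installed.
    from azure.ai.ml import command

    return command(**build_command_kwargs(instance_type))


kwargs = build_command_kwargs("gpu-instance")  # hypothetical instance type name
print(kwargs["compute"], kwargs["instance_type"])
```

With an authenticated MLClient, the object returned by build_command_job could then be submitted with ml_client.jobs.create_or_update(job).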
You manage an Azure Machine Learning workspace named workspace1.
You must develop Python SDK v2 code to add a compute instance to workspace1. The code must import all required modules and call the constructor of the ComputeInstance class.
You need to add the instantiated compute instance to workspace1.
What should you use?
A. constructor of the azure.ai.ml.ComputeSchedule class
B. constructor of the azure.ai.ml.ComputePowerAction enum
C. begin_create_or_update method of an instance of the azure.ai.ml.MLClient class
D. set_resources method of an instance of the azure.ai.ml.Command class
Correct Answer: C
Creating a compute instance is a one time process for your workspace. You can reuse the compute as a development workstation or as a compute target for training. You can have multiple compute instances attached to your workspace.
Example:
# Compute Instances need to have a unique name across the region.
# Here we create a unique name with current datetime
from azure.ai.ml.entities import ComputeInstance, AmlCompute
import datetime
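A fuller sketch of the same idea follows, under stated assumptions: the name prefix "basic-ci" and the VM size "STANDARD_DS3_v2" are illustrative choices, and ml_client is assumed to be an already authenticated MLClient for workspace1. The Azure import is deferred so the naming helper runs without the SDK.

```python
import datetime


def unique_compute_name(prefix: str = "basic-ci") -> str:
    """Append a minute-resolution timestamp so the name is unlikely to clash in the region."""
    return prefix + datetime.datetime.now().strftime("%Y%m%d%H%M")


def create_compute_instance(ml_client, size: str = "STANDARD_DS3_v2"):
    """Instantiate a ComputeInstance and add it to the workspace behind ml_client."""
    from azure.ai.ml.entities import ComputeInstance  # requires the azure-ai-ml package

    instance = ComputeInstance(name=unique_compute_name(), size=size)
    # begin_create_or_update returns a poller; result() blocks until provisioning finishes.
    return ml_client.begin_create_or_update(instance).result()


name = unique_compute_name()
print(name)
```

The constructor call only builds a local object; the instance is added to the workspace by the begin_create_or_update call on the MLClient.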
You create a workspace to include a compute instance by using Azure Machine Learning Studio. You are developing a Python SDK v2 notebook in the workspace.
You need to use Intellisense in the notebook.
What should you do?
A. Stop the compute instance.
B. Start the compute instance.
C. Run a %pip magic function on the compute instance.
D. Run a !pip magic function on the compute instance.
Correct Answer: C
Manage packages
Since your compute instance has multiple kernels, make sure to use the %pip or %conda magic functions, which install packages into the currently running kernel. Don't use !pip or !conda, which refer to all packages (including packages outside the currently running kernel).
Note: Code completion (IntelliSense)
IntelliSense is a code-completion aid that includes many features: List Members, Parameter Info, Quick Info, and Complete Word. With only a few keystrokes, you can:
You need to define an environment from a Docker image by using the Azure Machine Learning Python SDK v2.
Which parameter should you use?
A. properties
B. image
C. build
D. conda_file
Correct Answer: B
Create an environment from a Docker image
To define an environment from a Docker image, provide the image URI of the image hosted in a registry such as Docker Hub or Azure Container Registry.
The following example is a YAML specification file for an environment defined from a Docker image. An image from the official PyTorch repository on Docker Hub is specified via the image property in the YAML file.
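In the Python SDK v2, the corresponding setting is the image parameter of the Environment class. The sketch below is a minimal illustration, assuming a hypothetical environment name and description; the image URI is the PyTorch Docker Hub image mentioned above. The Azure import is deferred so the helper that builds the settings runs without the SDK.

```python
def docker_environment_spec(image_uri: str) -> dict:
    """Settings for an environment defined from an existing Docker image."""
    return {
        "name": "docker-image-example",   # hypothetical environment name
        "image": image_uri,               # image hosted in Docker Hub or Azure Container Registry
        "description": "Environment created from a Docker image.",
    }


def build_environment(image_uri: str):
    """Create the Environment entity (requires the azure-ai-ml package)."""
    from azure.ai.ml.entities import Environment

    return Environment(**docker_environment_spec(image_uri))


spec = docker_environment_spec("pytorch/pytorch:latest")
print(spec["image"])
```

With an authenticated MLClient, the entity could then be registered with ml_client.environments.create_or_update(env).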
You create an Azure Machine Learning managed compute resource. The compute resource is configured as follows:
Minimum nodes: 2
Maximum nodes: 4
You must decrease the minimum number of nodes and increase the maximum number of nodes to the following values:
Minimum nodes: 0
Maximum nodes: 8
You need to reconfigure the compute resource.
Which three methods can you use? Each correct answer presents a complete solution.
NOTE: Each correct selection is worth one point.
A. Azure Machine Learning designer
B. MLClient class in Python SDK v2
C. Azure Machine Learning studio
D. Azure CLI ml extension v2
E. BuildContext class in Python SDK v2
Correct Answer: ACD
Incorrect:
* MLClient Class: A client class to interact with Azure ML services. Use this client to manage Azure ML resources, e.g. workspaces, jobs, models and so on.
You plan to use automated machine learning by using Azure Machine Learning Python SDK v2 to train a regression model. You have data that has features with missing values, and categorical features with few distinct values.
You need to control whether automated machine learning automatically imputes missing values and encodes categorical features as part of the training task.
Which enum of the automl package should you use?
A. ForecastHorizonMode
B. RegressionModels
C. FeaturizationMode
D. RegressionPrimaryMetrics
Correct Answer: C
AutoMLConfig Class
Represents configuration for submitting an automated ML experiment in Azure Machine Learning.
This configuration object contains and persists the parameters for configuring the experiment run, as well as the training data to be used at run time.
Constructor
featurization: str or FeaturizationConfig
'auto' / 'off' / FeaturizationConfig. Indicator for whether the featurization step should be done automatically or not, or whether customized featurization should be used.
Column type is automatically detected. Based on the detected column type, preprocessing/featurization is done as follows:
* Categorical: Target encoding, one hot encoding, drop high cardinality categories, impute missing values.
* Numeric: Impute missing values, cluster distance, weight of evidence.
* DateTime: Several features such as day, seconds, minutes, hours etc.
* Text: Bag of words, pre-trained Word embedding, text target encoding.
You are developing a two-step Azure Machine Learning pipeline by using the Azure Machine Learning SDK for Python.
You need to register the output of the pipeline as a new version of a named dataset after the run has been completed.
What should you implement?
A. the as_input method of the OutputDatasetConfig class
B. the register_on_complete method of the OutputDatasetConfig class
C. the as_mount method of the DatasetConsumptionConfig class
D. the as_download method of the DatasetConsumptionConfig class
Correct Answer: B
The OutputDatasetConfig class register_on_complete method registers the output as a new version of a named Dataset after the run has completed.
If there are no datasets registered under the specified name, a new Dataset with the specified name will be registered. If there is a dataset registered under the specified name, then a new version will be added to this dataset.
Incorrect:
* The OutputDatasetConfig class as_input method specifies how to consume the output as an input in subsequent pipeline steps.
* as_mount sets the mode to mount. In the submitted run, files in the datasets will be mounted to a local path on the compute target. The mount point can be retrieved from argument values and the input_datasets field of the run context.
* as_download sets the mode to download. In the submitted run, files in the dataset will be downloaded to a local path on the compute target. The download location can be retrieved from argument values and the input_datasets field of the run context.
You are implementing hyperparameter tuning by using Bayesian sampling for a model training from a notebook. The notebook is in an Azure Machine Learning workspace that uses a compute cluster with 20 nodes.
The code implements Bandit termination policy with slack factor set to 0.2 and the HyperDriveConfig class instance with max_concurrent_runs set to 10.
You must increase effectiveness of the tuning process by improving sampling convergence.
You need to select which sampling convergence to use.
What should you select?
A. Set the value of slack factor of early_termination_policy to 0.9.
B. Set the value of max_concurrent_runs of HyperDriveConfig to 4.
C. Set the value of slack factor of early_termination_policy to 0.1.
D. Set the value of max_concurrent_runs of HyperDriveConfig to 20.
Correct Answer: B
Bayesian sampling
Bayesian sampling is based on the Bayesian optimization algorithm. It picks samples based on how previous samples did, so that new samples improve the primary metric.
The number of concurrent jobs has an impact on the effectiveness of the tuning process. A smaller number of concurrent jobs may lead to better sampling convergence, since the smaller degree of parallelism increases the number of jobs that benefit from previously completed jobs.
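The effect of concurrency can be illustrated with a small back-of-the-envelope calculation. This is a simplified model, not how the service schedules jobs: it assumes every job takes the same time, so jobs run in synchronized batches and only the first batch starts without any completed results to learn from. The total of 40 jobs is a hypothetical budget.

```python
def informed_jobs(total_jobs: int, max_concurrent: int) -> int:
    """Count jobs that start after at least one earlier job has finished.

    Simplifying assumption: equal job durations, so jobs run in synchronized
    batches of size max_concurrent and only the first batch is uninformed.
    """
    if total_jobs < 0 or max_concurrent < 1:
        raise ValueError("total_jobs must be >= 0 and max_concurrent >= 1")
    first_batch = min(total_jobs, max_concurrent)
    return total_jobs - first_batch


total = 40  # hypothetical job budget
for concurrency in (20, 10, 4):
    # Lower concurrency leaves more jobs that can use earlier results.
    print(concurrency, informed_jobs(total, concurrency))
```

Under this model, lowering max_concurrent_runs from 10 to 4 raises the number of jobs that can draw on earlier results from 30 to 36 out of 40, at the cost of a longer overall run.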